VIRAL MATERIAL AND NUCLEOTIDE FRAGMENTS ASSOCIATED WITH
MULTIPLE SCLEROSIS, FOR DIAGNOSTIC, PROPHYLACTIC AND
THERAPEUTIC PURPOSES

Multiple sclerosis (MS) is a demyelinating
disease of the central nervous system (CNS), the cause of
which remains as yet unknown.

"Multiple sclerosis (MS) is the most common 
neurological disease of young adults, with a prevalence in
Europe and North America of between 20 and 200 per
100,000. It is characterized clinically by a
relapsing/remitting or chronic progressive course,
frequently leading to severe disability. Current knowledge
suggests that MS is associated with autoimmunity, that
genetic background has an important influence and that
"infectious" agent(s) may be involved. Indeed, many
viruses have been proposed as possible candidates but, as
yet, none of them has been shown to play an aetiological
role.

Many studies have supported the hypothesis of a
viral aetiology of the disease, but none of the known
viruses tested has proved to be the causal agent sought: a
review of the viruses sought for several years in MS has
been compiled by E. Norrby (1) and R.T. Johnson (2).

The discovery of pathogenic retroviruses in man
(HTLVs and HIVs) was followed by great interest in their
ability to impair the immune system and to provoke central
nervous system inflammation and/or degeneration. In the
case of HTLV-1, its association with a chronic
inflammatory demyelinating disease in man (48) led to
extensive investigations to search for an HTLV-1-like
retrovirus in MS patients. However, despite initial
claims, the presence of HTLV-1 or HTLV-like retroviruses
was not confirmed.

Recently, a retrovirus different from the known
human retroviruses has been isolated in patients suffering
from MS (3, 4 and 5).

In 1989, the authors described the production of
extracellular virions, associated with reverse
transcriptase (RT) activity, by a culture of
leptomeningeal cells (LM7) obtained from the cerebrospinal
fluid of a patient with MS (3). This was followed by
similar findings in monocyte cultures from a series of MS
patients (5). Neither viral particles nor viral RT
activity was found in control individuals. Furthermore,
the authors were able to transfer the LM7 virus to
non-infected leptomeningeal cells in vitro (26). The molecular
characterization of the "LM7" retrovirus was a
prerequisite for further evaluation of its possible role
in MS. Considerable difficulties arose from the absence of
continuously productive retroviral cultures and from the
low levels of expression in the few transient cultures.
The strategy described here focused on RNA from
extracellular virions, in order to avoid non-specific
detection of cellular RNA and of endogenous elements from
contaminating human DNA. A specific retroviral sequence
associated with virions produced by cell cultures from
several MS patients has been identified. The entire
sequence of this novel retroviral genome is currently
being obtained using RT-PCR on RNA from extracellular
virions. The retrovirus previously called "LM7 virus"
corresponds to an oncovirus and is now designated MSRV
(Multiple Sclerosis-associated Retrovirus).

The authors were also able to show that this
retrovirus could be transmitted in vitro, that patients
suffering from MS produced antibodies capable of
recognizing proteins associated with the infection of
leptomeningeal cells by this retrovirus, and that the
expression of the latter could be strongly stimulated by
the immediate-early genes of some herpesviruses (6).

All these results point to the role in MS of at
least one unknown retrovirus or of a virus having reverse
transcriptase activity which is detectable according to
the method published by H. Perron (3) and qualified as
"LM7-like RT" activity. The content of the publication
identified by (3) is incorporated in the present
description by reference.

Recently, the Applicant's studies have enabled
two continuous cell lines infected with natural isolates
originating from two different patients suffering from MS
to be obtained by a culture method as described in the
document WO-A-93/20188, the content of which is incorporated
in the present description by reference. These two
lines, derived from human choroid plexus cells, designated
LM7PC and PLI-2, were deposited with the ECACC on
22nd July 1992 and 8th January 1993, respectively, under
numbers 92072201 and 93010817, in accordance with the
provisions of the Budapest Treaty. Moreover, the viral
isolates possessing LM7-like RT activity were also
deposited with the ECACC under the overall designation of
"strains". The "strain" or isolate harboured by the PLI-2
line, designated POL-2, was deposited with the ECACC on
22nd July 1992 under No. V92072202. The "strain" or
isolate harboured by the LM7PC line, designated MS7PG, was
deposited with the ECACC on 8th January 1993 under
No. V93010816.

Starting from the cultures and isolates
mentioned above, characterized by biological and
morphological criteria, the next step was to endeavour to
characterize the nucleic acid material associated with the
viral particles produced in these cultures.

The portions of the genome which have already
been characterized have been used to develop tests for
molecular detection of the viral genome and
immunoserological tests, using the amino acid sequences
encoded by the nucleotide sequences of the viral genome,
in order to detect the immune response directed against
epitopes associated with the infection and/or viral
expression.



These tools have already enabled an association
to be confirmed between MS and the expression of the
sequences identified in the patents cited later. However,
the viral system discovered by the Applicant is related to
a complex retroviral system. In effect, the sequences to
be found encapsidated in the extracellular viral particles
produced by the different cultures of cells of patients
suffering from MS show clearly that there is
coencapsidation of retroviral genomes which are related
to but different from the "wild-type" retroviral genome which
produces the infective viral particles. This phenomenon
has been observed between replicative retroviruses and
endogenous retroviruses belonging to the same family, or
even heterologous retroviruses. The notion of endogenous
retroviruses is very important in the context of our
discovery since, in the case of MSRV-1, it has been
observed that endogenous retroviral sequences comprising
sequences homologous to the MSRV-1 genome exist in normal
human DNA. The existence of endogenous retroviral elements
(ERV) related to MSRV-1 by all or part of their genome
explains the fact that the expression of the MSRV-1
retrovirus in human cells is able to interact with closely
related endogenous sequences. These interactions are to be
found in the case of pathogenic and/or infectious
endogenous retroviruses (for example some ecotropic
strains of the murine leukaemia virus), and in the case of
exogenous retroviruses whose nucleotide sequence may be
found partially or wholly, in the form of ERVs, in the
host animal's genome (e.g. mouse exogenous mammary tumor
virus transmitted via the milk). These interactions
consist mainly of (i) a trans-activation or coactivation
of ERVs by the replicative retrovirus, (ii) an
"illegitimate" encapsidation of RNAs related to ERVs, or
of ERVs - or even of cellular RNAs - simply possessing
compatible encapsidation sequences, in the retroviral
particles produced by the expression of the replicative
strain, which are sometimes transmissible and sometimes
with a pathogenicity of their own, and (iii) more or less
substantial recombinations between the coencapsidated
genomes, in particular in the phases of reverse
transcription, which lead to the formation of hybrid
genomes, which are sometimes transmissible and sometimes
with a pathogenicity of their own.

Thus, (i) different sequences related to MSRV-1
have been found in the purified viral particles; (ii)
molecular analysis of the different regions of the MSRV-1
retroviral genome should be carried out by systematically
analyzing the coencapsidated, interfering and/or
recombined sequences which are generated by the infection
and/or expression of MSRV-1; furthermore, some clones may
have defective sequence portions produced by the
retroviral replication and template errors and/or errors
of transcription of the reverse transcriptase; (iii) the
families of sequences related to the same retroviral
genomic region provide the means for an overall diagnostic
detection which may be optimized by the identification of
invariable regions among the clones expressed, and by the
identification of reading frames responsible for the
production of antigenic and/or pathogenic polypeptides
which may be produced only by a portion, or even by just
one, of the clones expressed, and, under these conditions,
the systematic analysis of the clones expressed in the
region of a given gene enables the frequency of variation
and/or of recombination of the MSRV-1 genome in this
region to be evaluated and the optimal sequences for the
applications, in particular diagnostic applications, to be
defined; (iv) the pathology caused by a retrovirus such as
MSRV-1 may be a direct effect of its expression and of the
proteins or peptides produced as a result thereof, but
also an effect of the activation, the encapsidation or the
recombination of related or heterologous genomes and of
the proteins or peptides produced as a result thereof;
thus, these genomes associated with the expression of
and/or infection by MSRV-1 are an integral part of the
potential pathogenicity of this virus, and hence
constitute means of diagnostic detection and special
therapeutic targets. Similarly, any agent associated with
or cofactor of these interactions responsible for the
pathogenesis in question, such as MSRV-2 or the gliotoxic
factor which are described in the patent application
published under No. FR-2,716,198, may participate in the
development of an overall and very effective strategy for
the diagnosis, prognosis, therapeutic monitoring and/or
integrated therapy of MS in particular, but also of any
other disease associated with the same agents.

In this context, a parallel discovery has been
made in another autoimmune disease, rheumatoid arthritis
(RA), which has been described in the French Patent
Application filed under No. 95/02960. This discovery shows
that, by applying methodological approaches similar to the
ones which were used in the Applicant's work on MS, it was
possible to identify a retrovirus expressed in RA which
shares the sequences described for MSRV-1 in MS, and also
the coexistence of an associated MSRV-2 sequence also
described in MS. As regards MSRV-1, the sequences detected
in common in MS and RA relate to the pol and gag genes. In
the current state of knowledge, it is possible to
associate the gag and pol sequences described with the
MSRV-1 strains expressed in these two diseases.

The present patent application relates to 
various results which are additional to those already 
protected by the following French Patent Applications: 
- No. 92/04322 of 03.04.1992, published under
No. 2,689,519;



WO 98/23755 PCT/IB97/01482 



- No. 92/13447 of 03.11.1992, published under
No. 2,689,521;

- No. 92/13443 of 03.11.1992, published under
No. 2,689,520;

- No. 94/01529 of 04.02.1994, published under
No. 2,715,936;

- No. 94/01531 of 04.02.1994, published under
No. 2,715,939;

- No. 94/01530 of 04.02.1994, published under
No. 2,715,936;

- No. 94/01532 of 04.02.1994, published under
No. 2,715,937;

- No. 94/14322 of 24.11.1994, published under
No. 2,727,428;

- and No. 94/15810 of 23.12.1994, published under
No. 2,728,585.

The present invention relates, in the first
place, to a viral material, in the isolated or purified
state, which may be recognized or characterized in
different ways:

- its genome comprises a nucleotide sequence chosen from
the group including the sequences SEQ ID NO:46, SEQ ID
NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:56, SEQ ID
NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID
NO:89, their complementary sequences and their equivalent
sequences, in particular nucleotide sequences displaying,
for any succession of 100 contiguous monomers, at least
50% and preferably at least 70% homology with the said
sequences SEQ ID NO:46, SEQ ID NO:51, SEQ ID NO:52, SEQ ID
NO:53, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:59, SEQ ID
NO:60, SEQ ID NO:61, SEQ ID NO:89, respectively, and their
complementary sequences;

- the region of its genome comprising the env and pol
genes and a portion of the gag gene, excluding the
subregion having a sequence identical or equivalent to
SEQ ID NO:1, codes for any polypeptide displaying, for any
contiguous succession of at least 30 amino acids, at least
50% and preferably at least 70% homology with a peptide
sequence encoded by any nucleotide sequence chosen from
the group including SEQ ID NO:46, SEQ ID NO:51, SEQ ID
NO:52, SEQ ID NO:53, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:89 and their
complementary sequences;

- the pol gene comprises a nucleotide sequence partially
or totally identical or equivalent to SEQ ID NO:57 or SEQ
ID NO:93, excluding SEQ ID NO:1;

- the gag gene comprises a nucleotide sequence partially
or totally identical or equivalent to SEQ ID NO:88.
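
The windowed homology criterion used in the definitions above ("for any succession of 100 contiguous monomers, at least 50% and preferably at least 70% homology") can be illustrated by the following minimal sketch. It rests on assumptions that are not part of the description: homology is approximated by percent identity, and the two sequences are taken to be already aligned, ungapped and of equal length. The function names are hypothetical.

```python
def min_window_identity(ref: str, cand: str, window: int = 100) -> float:
    """Lowest percent identity found over every run of `window` contiguous positions.

    Assumption (not from the description): `ref` and `cand` are pre-aligned,
    ungapped and of equal length, and homology is read as percent identity.
    """
    if len(ref) != len(cand):
        raise ValueError("sequences must have equal length")
    if len(ref) < window:
        raise ValueError("sequences are shorter than the window")
    matches = [1 if a == b else 0 for a, b in zip(ref.upper(), cand.upper())]
    current = sum(matches[:window])      # matches in the first window
    lowest = current
    for i in range(window, len(matches)):
        # Slide the window one position: add the new site, drop the oldest.
        current += matches[i] - matches[i - window]
        lowest = min(lowest, current)
    return 100.0 * lowest / window


def meets_threshold(ref: str, cand: str, window: int = 100,
                    threshold: float = 50.0) -> bool:
    """True if every window of `window` monomers reaches `threshold` percent."""
    return min_window_identity(ref, cand, window) >= threshold
```

Because the criterion applies to any succession of 100 monomers, the sketch reports the worst window rather than the average over the whole sequence. The same function could be applied to peptide sequences with a window of 30 residues.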

As indicated above, according to the present
invention, the viral material as defined above is
associated with MS. And as defined by reference to the pol
or gag gene of MSRV-1, and more especially to the
sequences SEQ ID NOS 51, 56, 57, 59, 60, 61, 88, 89, 93,
169, 170, 171, 172, 176, 177, 178 and 179, this viral
material is associated with RA.

The present invention also relates to a nucleic
material, in the isolated or purified state, having at
least one of the following definitions:

- a nucleic material comprising a nucleotide sequence
selected from the group including sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 50% and preferably at least
60% homology with said sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, and their complementary
sequences, excluding HSERV-9 (or ERV-9); advantageously,
the nucleotide sequence of said nucleic material is
selected from the group including sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 70% and preferably at least
80% homology with said sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, and their complementary
sequences;

- a nucleic material, in the isolated or purified state,
coding for any polypeptide displaying, for any contiguous
succession of at least 30 amino acids, at least 50%,
preferably at least 60%, and most preferably at least 70%
homology with a peptide sequence encoded by any nucleotide
sequence selected from the group including SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179 and their complementary
sequences;

- a nucleic material, in the isolated or purified state,
of retroviral type, comprising a nucleotide sequence
identical or similar to at least part of the pol gene of
an isolated retrovirus associated with multiple sclerosis
or rheumatoid arthritis; advantageously, said nucleotide
sequence is 80% similar to said at least part of the pol
gene;

- a nucleic material comprising a nucleotide sequence
identical or similar to at least part of the pol gene of an
isolated virus encoding a reverse transcriptase having an
enzymatic site comprised between the amino acid domains
LPQG-YXDD, having a phylogenic distance with HSERV-9 of
0.063 ± 0.1, and preferably 0.063 ± 0.05; the phylogenic
distances are calculated on the basis of a reference
sequence according to UPGM tree option of the Geneworks™
Software (INTELLIGENETICS);

By enzymatic site, we understand the amino acid domain(s)
conferring the specific activity of a given enzyme.
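
As an illustration of the definition above, the following sketch locates the stretch of a protein sequence lying between an LPQG motif and the next YXDD motif, and checks whether a distance value falls within 0.063 ± 0.1 or 0.063 ± 0.05. It is only a sketch under stated assumptions: X is read as any single amino acid, the distance is supplied by the user (it is not computed here and the Geneworks calculation is not reproduced), and all function names are hypothetical.

```python
import re
from typing import Optional

# LPQG, then the shortest possible stretch, then Y-any residue-D-D.
# Reading "X" as "any one amino acid" is an assumption of this sketch.
MOTIF = re.compile(r"LPQG(.*?)Y[A-Z]DD")


def region_between_motifs(protein: str) -> Optional[str]:
    """Return the residues between LPQG and the next YXDD, or None if absent."""
    match = MOTIF.search(protein.upper())
    return match.group(1) if match else None


def within_distance(distance: float, centre: float = 0.063,
                    tolerance: float = 0.1) -> bool:
    """True if a precomputed distance lies within centre +/- tolerance."""
    return abs(distance - centre) <= tolerance
```

Calling `within_distance(d, tolerance=0.05)` corresponds to the preferred, narrower range.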
The present invention also relates to different
nucleotide fragments each comprising a nucleotide sequence
chosen from the group including:

(a) all the genomic sequences, partial and total, of the
pol gene of the MSRV-1 virus, except for the total
sequence of the nucleotide fragment defined by
SEQ ID NO:1;

(b) all the genomic sequences, partial and total, of the
env gene of MSRV-1;

(c) all the partial genomic sequences of the gag gene of
MSRV-1;

(d) all the genomic sequences overlapping the pol gene and
the env gene of the MSRV-1 virus, and overlapping the pol
gene and the gag gene;

(e) all the sequences, partial and total, of a clone
chosen from the group including the clones FBd3
(SEQ ID NO:46), t pol (SEQ ID NO:51), JLBc1
(SEQ ID NO:52), JLBc2 (SEQ ID NO:53) and GM3
(SEQ ID NO:56), FBd13 (SEQ ID NO:58), LB19 (SEQ ID NO:59),
LTRGAG12 (SEQ ID NO:60), FP6 (SEQ ID NO:61), G+E+A
(SEQ ID NO:89), excluding any nucleotide sequence
identical to or lying within the sequence defined by
SEQ ID NO:1;

(f) sequences complementary to the said genomic sequences;

(g) sequences equivalent to the said sequences (a) to (e),
in particular nucleotide sequences displaying, for any
succession of 100 contiguous monomers, at least 50% and
preferably at least 70% homology with the said sequences
(a) to (d),

provided that this nucleotide fragment does not comprise
or consist of the sequence ERV-9 as described in LA MANTIA
et al. (18).

The term genomic sequences, partial or total,
includes all sequences associated by coencapsidation or by
coexpression, or recombined sequences.

Preferably, such a fragment comprises:
- either a nucleotide sequence identical to a partial or
total genomic sequence of the pol gene of the MSRV-1
virus, except for the total sequence of the nucleotide
fragment defined by SEQ ID NO:1, or identical to any
sequence equivalent to the said partial or total genomic
sequence, in particular one which is homologous to the
latter;

- or a nucleotide sequence identical to a partial or total
genomic sequence of the env gene of the MSRV-1 virus, or
identical to any sequence complementary to the said
nucleotide sequence, or identical to any sequence
equivalent to the said nucleotide sequence, in particular
one which is homologous to the latter.
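
The complementary sequences referred to above can be derived mechanically from a given strand. The sketch below assumes DNA written 5' to 3' and returns the complementary strand also written 5' to 3' (the reverse complement); treating "complementary sequence" in this way is a common convention and an assumption here, and the function name is hypothetical.

```python
# Watson-Crick pairing for DNA; N (unknown base) is left unchanged.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Complementary strand of `dna`, read in the 5'-3' direction."""
    return dna.upper().translate(_COMPLEMENT)[::-1]
```

For an RNA sequence, U would first have to be mapped to T (or the table extended), which this sketch does not do.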

In particular, the invention relates to a
nucleotide fragment comprising a coding nucleotide
sequence which is partially or totally identical to a
nucleotide sequence chosen from the group including:

- the nucleotide sequence defined by SEQ ID NO:40, SEQ ID
NO:62 or SEQ ID NO:89;

- sequences complementary to SEQ ID NO:40, SEQ ID NO:62 or
SEQ ID NO:89;

- sequences equivalent, and in particular homologous to
SEQ ID NO:40, SEQ ID NO:62 or SEQ ID NO:89;

- sequences coding for all or part of the peptide sequence
defined by SEQ ID NO:39, SEQ ID NO:63 or SEQ ID NO:90;

- sequences coding for all or part of a peptide sequence
equivalent, in particular homologous to SEQ ID NO:39, SEQ
ID NO:63 or SEQ ID NO:90, which is capable of being
recognized by sera of patients infected with the MSRV-1
virus, or in whom the MSRV-1 virus has been reactivated.

The invention also relates to a nucleotide
fragment (called fragment I) having at least one of the
following definitions:

- a nucleotide fragment comprising a nucleotide sequence
selected from the group including SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 50% and preferably at least
60% homology with said sequences and their complementary
sequences, said group excluding SEQ ID NO:1,
said nucleotide fragment not comprising nor consisting of
the sequence HSERV-9 (or ERV-9); preferably the nucleotide
sequence of said fragment is selected from the group
including SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:169,
SEQ ID NO:170, SEQ ID NO:171, SEQ ID NO:172,
SEQ ID NO:176, SEQ ID NO:177, SEQ ID NO:178,
SEQ ID NO:179, their complementary sequences and their
equivalent sequences, in particular nucleotide sequences
displaying, for any succession of 100 contiguous monomers,
at least 70% and preferably at least 80% homology with
said sequences and their complementary sequences;

- a nucleotide fragment comprising a coding nucleotide
sequence which is partially or totally identical to a
nucleotide sequence selected from the group including:
SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:169,
SEQ ID NO:170, SEQ ID NO:171, SEQ ID NO:172,
SEQ ID NO:176, SEQ ID NO:177, SEQ ID NO:178,
SEQ ID NO:179; their complementary sequences; their
equivalent sequences, in particular homologous to
SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170,
SEQ ID NO:171, SEQ ID NO:172, SEQ ID NO:176,
SEQ ID NO:177, SEQ ID NO:178, SEQ ID NO:179;
sequences encoding all or part of the peptide
sequence defined by SEQ ID NO:95, SEQ ID NO:173,
SEQ ID NO:174, SEQ ID NO:175, SEQ ID NO:180,
SEQ ID NO:181, SEQ ID NO:182;
sequences encoding all or part of a peptide
sequence equivalent, in particular homologous to
SEQ ID NO:95, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:175,
SEQ ID NO:180, SEQ ID NO:181, SEQ ID NO:182, which is
capable of being recognized by sera of patients infected
with the MSRV-1 virus, or in whom the MSRV-1 virus has
been reactivated.

The invention also relates to any nucleic acid
probe for the detection of virus associated with MS and/or
rheumatoid arthritis (RA), which is capable of hybridizing
specifically with any fragment such as is defined above,
belonging to or lying within the genome of the said
pathogenic agent. It relates, in addition, to any nucleic
acid probe for detection of a pathogenic and/or infective
agent associated with RA, which is capable of hybridizing
specifically with any fragment as defined above by
reference to the pol and gag genes, and especially with
respect to the sequences SEQ ID NOS 40, 51, 56, 59, 60,
61, 62, 89 and SEQ ID NOS 39, 63 and 90.

The invention also relates to a primer for the
amplification by polymerization of an RNA or a DNA of a
viral material, associated with MS and/or RA, comprising a
nucleotide sequence identical or equivalent to at least
one portion of the nucleotide sequence of any fragment
such as is defined above, in particular a nucleotide
sequence displaying, for any succession of at least 10
contiguous monomers, preferably 15 contiguous monomers,
more preferably 18 contiguous monomers and even most
preferably 20 contiguous monomers, at least 70% homology
with at least the said portion of the said fragment.
Preferably, the nucleotide sequence of such a primer is
identical to any one of the sequences selected from the
group including SEQ ID NO:47 to SEQ ID NO:50,
SEQ ID NO:55, SEQ ID NO:64, SEQ ID NO:86, SEQ ID NO:99 to
SEQ ID NO:111, SEQ ID NO:183, SEQ ID NO:184,
SEQ ID NO:185, SEQ ID NO:186.
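
The primer criterion above (at least 70% homology over at least 10, preferably 15, 18 or 20 contiguous monomers with a portion of a fragment) can be sketched as follows. The sketch assumes that homology is measured as ungapped percent identity and that only the sense strand of the fragment is scanned; the function name and the example sequences in the usage note are hypothetical and are not taken from the sequence listing.

```python
from typing import Tuple


def best_ungapped_identity(primer: str, fragment: str) -> Tuple[float, int]:
    """Best percent identity of `primer` against any same-length portion of
    `fragment`, together with the 0-based start of that portion.

    Assumption (not from the description): no gaps, sense strand only.
    """
    primer, fragment = primer.upper(), fragment.upper()
    n = len(primer)
    if n == 0 or n > len(fragment):
        raise ValueError("primer must be non-empty and no longer than the fragment")
    best_pct, best_pos = -1.0, -1
    for start in range(len(fragment) - n + 1):
        portion = fragment[start:start + n]
        matches = sum(a == b for a, b in zip(primer, portion))
        pct = 100.0 * matches / n
        if pct > best_pct:               # keep the first best-scoring position
            best_pct, best_pos = pct, start
    return best_pct, best_pos
```

For example, a hypothetical 10-mer `ACGTTCGTAC` scanned against the hypothetical fragment `GGGGACGTACGTACCCC` scores 90% at position 4, which would satisfy a 70% threshold over 10 contiguous monomers.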
Generally speaking, the invention also
encompasses any RNA or DNA, and in particular replication
vector, comprising a genomic fragment of the viral
material such as is defined above, or a nucleotide
fragment such as is defined above.

The invention also relates to the different
peptides encoded by any open reading frame belonging to a
nucleotide fragment such as is defined above, in
particular any polypeptide, for example any oligopeptide
forming or comprising an antigenic determinant recognized
by sera of patients infected with the MSRV-1 virus and/or
in whom the MSRV-1 virus has been reactivated. Preferably,
this polypeptide is antigenic, and is encoded by the open
reading frame beginning, in the 5'-3' direction, at
nucleotide 181 and ending at nucleotide 330 of
SEQ ID NO:1.
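
An open reading frame defined by nucleotide positions, such as positions 181 to 330 above (150 nucleotides, i.e. 50 codons), can be extracted and translated as in the following sketch. It assumes 1-based, inclusive coordinates and the standard genetic code; both are assumptions of the example rather than statements of the description, and the function names and the short sequence in the usage note are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def extract_orf(sequence: str, start: int, end: int) -> str:
    """Return positions start..end (1-based, inclusive) as upper-case DNA."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    orf = sequence[start - 1:end].upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf


def translate(orf: str) -> str:
    """Translate codon by codon; '*' marks a stop, 'X' an unknown codon."""
    return "".join(CODON_TABLE.get(orf[i:i + 3], "X")
                   for i in range(0, len(orf), 3))
```

With a hypothetical sequence `CCATGGCTTAAGG`, `extract_orf(seq, 3, 11)` yields `ATGGCTTAA`, which `translate` renders as `MA*`.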

The invention also encompasses the following
polypeptides:
a)

- a polypeptide encoded by any open reading frame
belonging to a nucleotide fragment, fragment I, as defined
above;

- a polypeptide, characterized in that the open reading
frame encoding it is comprised, in the 5'-3' direction,
between nucleotide 18 and nucleotide 2304 of SEQ ID NO:93;

- a polypeptide, having a peptide sequence comprising a
sequence partially or totally identical to SEQ ID NO:95;
b)

- a polypeptide, recombinant or synthetic, having a
peptide sequence which comprises a sequence identical or
equivalent to SEQ ID NO:96; in particular said polypeptide
exhibits an enzymatic activity consisting of proteolytic
activity;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 18 and ends at nucleotide
340 of SEQ ID NO:93;

- a polypeptide having an inhibitory activity on the
proteolytic activity of a polypeptide as defined according
to b);
c)

- a polypeptide, recombinant or synthetic, having a
peptide sequence which comprises a sequence identical or
equivalent to SEQ ID NO:97; in particular said polypeptide
exhibits a reverse transcriptase activity;

- a polypeptide having a peptide sequence which comprises
a sequence identical or equivalent to SEQ ID NO:98; in
particular said polypeptide exhibits a ribonuclease H
activity;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 341 and ends at nucleotide
2304 of SEQ ID NO:93;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 1858 and ends at nucleotide
2304 of SEQ ID NO:93;

- a polypeptide having an inhibitory activity on the
reverse transcriptase activity of a polypeptide as defined
according to c) or on the ribonuclease H activity of a
polypeptide as defined according to c).

In particular, the invention relates to an
antigenic polypeptide recognized by the sera of patients
infected with the MSRV-1 virus, and/or in whom the MSRV-1
virus has been reactivated, whose peptide sequence is
partially or totally identical or is equivalent to the
sequence defined by SEQ ID NO:39, SEQ ID NO:63,
SEQ ID NO:87, SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97,
SEQ ID NO:98, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:175,
SEQ ID NO:180, SEQ ID NO:181 and SEQ ID NO:182; such a
sequence is identical, for example, to any sequence
selected from the group including the sequences
SEQ ID NO:41 to SEQ ID NO:44, SEQ ID NO:63 and
SEQ ID NO:87.

The present invention also proposes mono- or
polyclonal antibodies directed against the MSRV-1 virus,
which are obtained by the immunological reaction of a
human or animal body or cells to an immunogenic agent
consisting of an antigenic polypeptide such as is defined
above.

The invention next relates to:

- reagents for detection of the MSRV-1 virus, or of an
exposure to the latter, comprising at least one reactive
substance selected from the group consisting of a probe of
the present invention, a polypeptide, in particular an
antigenic peptide, such as is defined above, or an
anti-ligand, in particular an antibody to the said polypeptide;
- all diagnostic, prophylactic or therapeutic compositions
comprising one or more peptides, in particular antigenic
peptides, such as are defined above, or one or more
anti-ligands, in particular antibodies to the peptides,
discussed above; such a composition is preferably, and by
way of example, a vaccine composition.

The invention also relates to any diagnostic,
prophylactic or therapeutic composition, in particular for
inhibiting the expression of at least one virus associated
with MS or RA, and/or the enzymatic activities of the
proteins of said virus, comprising a nucleotide fragment
such as is defined above or a polynucleotide, in
particular oligonucleotide, whose sequence is partially
identical to that of the said fragment, except for that of
the fragment having the nucleotide sequence SEQ ID NO:1.
Likewise, it relates to any diagnostic, prophylactic or
therapeutic composition, in particular for inhibiting the
expression of at least one pathogenic and/or infective
agent associated with RA, comprising a nucleotide fragment
such as is defined above by reference to the pol and gag
genes, and especially with respect to the sequences
SEQ ID NOS 40, 51, 56, 59, 60, 61, 62 and 89.

According to the invention, these same fragments
or polynucleotides, in particular oligonucleotides, may
participate in all suitable compositions for detecting,
according to any suitable process or method, a
pathological and/or infective agent associated with MS and with
RA, respectively, in a biological sample. In such a
process, an RNA and/or a DNA presumed to belong to or to
originate from the said pathological and/or infective
agent, and/or their complementary RNA and/or DNA, is/are
brought into contact with such a composition.

The present invention also relates to any
process for detecting the presence of or exposure to such a
pathological and/or infective agent, in a biological
sample, by bringing this sample into contact with a
peptide, in particular an antigenic peptide such as is
defined above, or an anti-ligand, in particular an
antibody to this peptide, such as is defined above.

In practice, and for example, a device for
detection of the MSRV-1 virus comprises a reagent such as
is defined above, supported by a solid support which is
immunologically compatible with the reagent, and a means
for bringing the biological sample, for example a sample
of blood or of cerebrospinal fluid, likely to contain
anti-MSRV-1 antibodies, into contact with this reagent
under conditions permitting a possible immunological
reaction, the foregoing items being accompanied by means
for detecting the immune complex formed with this reagent.

Lastly, the invention also relates to the detection
of anti-MSRV-1 antibodies in a biological sample, for
example a sample of blood or of cerebrospinal fluid,
according to which this sample is brought into contact
with a reagent such as is defined above, consisting of an
antibody, under conditions permitting their possible
immunological reaction, and the presence of the immune
complex thereby formed with the reagent is then detected.

Before describing the invention in detail, 
different terms used in the description and the claims are 
now defined: 

- strain or isolate is understood to mean any
infective and/or pathogenic biological fraction containing,
for example, viruses and/or bacteria and/or parasites,
generating pathogenic and/or antigenic power,
harboured by a culture or a living host; as an example, a
viral strain according to the above definition can contain
a coinfective agent, for example a pathogenic protist,

- the term "MSRV" used in the present
description denotes any pathogenic and/or infective agent
associated with MS, in particular a viral species, the
attenuated strains of the said viral species or the
defective-interfering particles or particles containing
coencapsidated genomes, or alternatively genomes
recombined with a portion of the MSRV-1 genome, derived
from this species. Viruses, and especially viruses
containing RNA, are known to have a variability resulting,
in particular, from relatively high rates of spontaneous
mutation (7), which will be borne in mind below for
defining the notion of equivalence,

- human virus is understood to mean a virus
capable of infecting, or of being harboured by, human
beings,

- in view of all the natural or induced variations
and/or recombination which may be encountered when
implementing the present invention, the subjects of the
latter, defined above and in the claims, have been
expressed including the equivalents or derivatives of the
different biological materials defined below, in
particular of the homologous nucleotide or peptide
sequences,

- the variant of a virus or of a pathogenic
and/or infective agent according to the invention
comprises at least one antigen recognized by at least one
antibody directed against at least one corresponding
antigen of the said virus and/or said pathogenic and/or
infective agent, and/or a genome any part of which is
detected by at least one hybridization probe and/or at
least one nucleotide amplification primer specific for the
said virus and/or pathogenic and/or infective agent, such
as, for example, for the MSRV-1 virus, the primers and
probes having a nucleotide sequence chosen from
SEQ ID NO:20 to SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:16
to SEQ ID NO:19, SEQ ID NO:31 to SEQ ID NO:33,
SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49,
SEQ ID NO:50, SEQ ID NO:45 and their complementary
sequences, under particular hybridization conditions well
known to a person skilled in the art,

- according to the invention, a nucleotide
fragment or an oligonucleotide or polynucleotide is an
arrangement of monomers, or a biopolymer, characterized by
the informational sequence of the natural nucleic acids,
which is capable of hybridizing with any other nucleotide
fragment under predetermined conditions, it being possible
for the arrangement to contain monomers of different
chemical structures and to be obtained from a molecule of
natural nucleic acid and/or by genetic recombination
and/or by chemical synthesis; a nucleotide fragment may be
identical to a genomic fragment of the MSRV-1 virus
discussed in the present invention, in particular a gene
of this virus, for example pol or env in the case of the
said virus,

- thus, a monomer can be a natural nucleotide of
nucleic acid whose constituent elements are a sugar, a
phosphate group and a nitrogenous base; in RNA the sugar
is ribose, in DNA the sugar is 2-deoxyribose; depending on
whether the nucleic acid is DNA or RNA, the nitrogenous
base is chosen from adenine, guanine, uracil, cytosine and
thymine; or the nucleotide can be modified in at least one
of the three constituent elements; as an example, the
modification can occur in the bases, generating modified
bases such as inosine, 5-methyldeoxycytidine,
deoxyuridine, 5-(dimethylamino)deoxyuridine,
2,6-diaminopurine, 5-bromodeoxyuridine and any other modified
base promoting hybridization; in the sugar, the
modification can consist of the replacement of at least
one deoxyribose by a polyamide (8), and in the phosphate
group, the modification can consist of its replacement by
esters chosen, in particular, from diphosphate, alkyl- and
arylphosphonate and phosphorothioate esters,

- "informational sequence" is understood to mean
any ordered succession of monomers whose chemical nature
and order in a reference direction constitute, or do not
constitute, an item of functional information of the same
quality as that of the natural nucleic acids,

- hybridization is understood to mean the
process during which, under suitable working conditions,
two nucleotide fragments having sufficiently complementary
sequences pair to form a complex structure, in particular
double or triple, preferably in the form of a helix,

- a probe comprises a nucleotide fragment synthesized
chemically or obtained by digestion or enzymatic
cleavage of a longer nucleotide fragment, comprising at
least six monomers, advantageously from 10 to 1000 monomers,
preferably 10 to 30 monomers and more preferably 18
to 30, and possessing a specificity of hybridization under
particular conditions; preferably, a probe possessing
fewer than 10 monomers, and preferably fewer than 15
monomers, is not used alone, but is used in the presence of
other probes of equally short size or otherwise; under
certain special conditions, it may be useful to use probes
of size greater than 100 monomers; a probe may be used, in
particular, for diagnostic purposes, such molecules being,
for example, capture and/or detection probes,

- the capture probe may be immobilized on a
solid support by any suitable means, that is to say
directly or indirectly, for example by covalent bonding or
passive adsorption,

- the detection probe may be labelled by means
of a label chosen, in particular, from radioactive
isotopes, enzymes chosen, in particular, from peroxidase
and alkaline phosphatase and those capable of hydrolysing
a chromogenic, fluorogenic or luminescent substrate,
chromophoric chemical compounds, chromogenic, fluorogenic
or luminescent compounds, nucleotide base analogues and
biotin,

- the probes used for diagnostic purposes of the
invention may be employed in all known hybridization
techniques, and in particular the techniques termed
"DOT-BLOT" (9), "SOUTHERN BLOT" (10), "NORTHERN BLOT", which is
a technique identical to the "SOUTHERN BLOT" technique but
which uses RNA as target, and the SANDWICH technique (11);
advantageously, the SANDWICH technique is used in the
present invention, comprising a specific capture probe
and/or a specific detection probe, on the understanding
that the capture probe and the detection probe must
possess an at least partially different nucleotide
sequence,

- any probe according to the present invention
can hybridize in vivo or in vitro with RNA and/or with DNA
in order to block the phenomena of replication, in
particular translation and/or transcription, and/or to
degrade the said DNA and/or RNA,

- a primer is a probe comprising at least six
monomers, and advantageously from 10 to 30 monomers, and
preferably from 18 to 25 monomers, possessing a
specificity of hybridization under particular conditions
for the initiation of an enzymatic polymerization, for
example in an amplification technique such as PCR
(polymerase chain reaction), in an elongation process such
as sequencing, in a method of reverse transcription or the
like,

- two nucleotide or peptide sequences are termed
equivalent or derived with respect to one another, or with
respect to a reference sequence, if functionally the
corresponding biopolymers can perform substantially the
same role, without being identical, as regards the
application or use in question, or in the technique in
which they participate; two sequences are, in particular,
equivalent if they are obtained as a result of natural
variability, in particular spontaneous mutation of the
species from which they have been identified, or induced
variability, as are two homologous sequences, homology
being defined below,

- "variability" is understood to mean any
spontaneous or induced modification of a sequence, in
particular by substitution and/or insertion and/or deletion
of nucleotides and/or of nucleotide fragments, and/or
extension and/or shortening of the sequence at one or both
ends; an unnatural variability can result from the genetic
engineering techniques used, for example the choice of
synthesis primers, degenerate or otherwise, selected for
amplifying a nucleic acid; this variability can manifest
itself in modifications of any starting sequence,
considered as reference, and capable of being expressed by
a degree of homology relative to the said reference
sequence,

- homology characterizes the degree of identity
of two nucleotide or peptide fragments compared; it is
measured by the percentage identity which is determined,
in particular, by direct comparison of nucleotide or
peptide sequences, relative to reference nucleotide or
peptide sequences,

- this percentage identity has been specifically
determined for the nucleotide fragments, clones in
particular, dealt with in the present invention, which are
homologous to the fragments identified, for the MSRV-1
virus, by SEQ ID NO:1 to NO:9, SEQ ID NO:46, SEQ ID NO:51
to SEQ ID NO:53, SEQ ID NO:40, SEQ ID NO:56, SEQ ID NO:57
and SEQ ID NO:93, as well as for the probes and primers
homologous to the probes and primers identified by SEQ ID
NO:20 to SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:16 to SEQ
ID NO:19, SEQ ID NO:31 to SEQ ID NO:33, SEQ ID NO:45, SEQ
ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID
NO:55, SEQ ID NO:40, SEQ ID NO:56, SEQ ID NO:57 and SEQ ID
NO:99 to SEQ ID NO:111; as an example, the smallest
percentage identity observed between the different general
consensus sequences of nucleic acids obtained from
fragments of MSRV-1 viral RNA, originating from the LM7PC
and PLI-2 lines according to a protocol detailed later, is
67% in the region described in Figure 1,

- any nucleotide fragment is termed equivalent
or derived from a reference fragment if it possesses a
nucleotide sequence equivalent to the sequence of the
reference fragment; according to the above definition, the
following in particular are equivalent to a reference
nucleotide fragment:

a) any fragment capable of hybridizing at least
partially with the complement of the reference fragment,

b) any fragment whose alignment with the reference
fragment results in the demonstration of a larger
number of identical contiguous bases than with any other
fragment originating from another taxonomic group,

c) any fragment resulting, or capable of resulting,
from the natural variability of the species from
which it is obtained,

d) any fragment capable of resulting from the
genetic engineering techniques applied to the reference
fragment,

e) any fragment containing at least eight
contiguous nucleotides encoding a peptide which is
homologous or identical to the peptide encoded by the
reference fragment,

f) any fragment which is different from the
reference fragment by insertion, deletion or substitution
of at least one monomer, or extension or shortening at one
or both of its ends; for example, any fragment
corresponding to the reference fragment flanked at one or
both of its ends by a nucleotide sequence not coding for a
polypeptide,

- polypeptide is understood to mean, in particular,
any peptide of at least two amino acids, in particular
an oligopeptide, or protein, and for example an
enzyme, extracted, separated or substantially isolated or
synthesized through human intervention, in particular
those obtained by chemical synthesis or by expression in a
recombinant organism,

- polypeptide partially encoded by a nucleotide
fragment is understood to mean a polypeptide possessing at
least three amino acids encoded by at least nine
contiguous monomers lying within the said nucleotide
fragment,

- an amino acid is termed analogous to another
amino acid when their respective physicochemical properties,
such as polarity, hydrophobicity and/or basicity
and/or acidity and/or neutrality, are substantially the
same; thus, a leucine is analogous to an isoleucine,

- any polypeptide is termed equivalent or
derived from a reference polypeptide if the polypeptides
compared have substantially the same properties, and in
particular the same antigenic, immunological,
enzymological and/or molecular recognition properties; the
following in particular are equivalent to a reference
polypeptide:

a) any polypeptide possessing a sequence in
which at least one amino acid has been replaced by an
analogous amino acid,

b) any polypeptide having an equivalent peptide
sequence, obtained by natural or induced variation of the
said reference polypeptide and/or of the nucleotide
fragment coding for the said polypeptide,

c) a mimotope of the said reference polypeptide,

d) any polypeptide in whose sequence one or more
amino acids of the L series are replaced by an amino acid
of the D series, and vice versa,

e) any polypeptide into whose sequence a modification
of the side chains of the amino acids has been
introduced, such as, for example, an acetylation of the
amine functions, a carboxylation of the thiol functions,
an esterification of the carboxyl functions,

f) any polypeptide in whose sequence one or more
peptide bonds have been modified, such as, for example,
carba, retro, inverso, retro-inverso, reduced and
methylenoxy bonds,

g) any polypeptide at least one antigen of
which is recognized by an antibody directed against a
reference polypeptide,

- the percentage identity characterizing the
homology of two peptide fragments compared is, according
to the present invention, at least 50% and preferably at
least 70%.
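
By way of illustration only, the notion of percentage identity used in the definitions above can be sketched in Python. This is a minimal sketch under stated assumptions, not the method used to obtain the figures quoted in the description: it assumes the two sequences have already been aligned to the same length, that "-" marks a gap, and that a gap position counts as a mismatch. The function names `percent_identity` and `homology_class` are hypothetical; only the 50% and 70% thresholds for peptide fragments come from the text.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption: both sequences have the same length and '-' marks a gap;
    a position where either sequence has a gap counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def homology_class(identity: float) -> str:
    """Compare a peptide percentage identity with the thresholds in the text.

    The labels returned are illustrative; the 50% and 70% limits are those
    given for peptide fragments in the definition above.
    """
    if identity >= 70.0:
        return "preferred (at least 70%)"
    if identity >= 50.0:
        return "homologous (at least 50%)"
    return "below threshold"
```

For example, `percent_identity("VLPQG", "VLPQG")` is 100.0, and two aligned fragments differing at one position in four give 75.0, which `homology_class` places in the preferred range.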

In view of the fact that a virus possessing
reverse transcriptase enzymatic activity may be genetically
characterized equally well in RNA and in DNA form,
both the viral DNA and RNA will be referred to for
characterizing the sequences relating to a virus possessing
such reverse transcriptase activity, termed MSRV-1
according to the present description.
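
The equivalence between the RNA and DNA forms of a sequence can be illustrated with a short Python sketch. It is given as an assumption-laden example only: the helper names `rna_to_dna`, `dna_to_rna` and `reverse_complement` are hypothetical, and only the four unmodified bases of each nucleic acid are handled (modified bases such as inosine are outside its scope).

```python
# Translation table for the four unmodified DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_to_dna(rna: str) -> str:
    """Write an RNA sequence in DNA form by replacing uracil with thymine."""
    return rna.upper().replace("U", "T")


def dna_to_rna(dna: str) -> str:
    """Write a DNA sequence in RNA form by replacing thymine with uracil."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, read 5' to 3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]
```

A fragment read from viral RNA and the same fragment read from its DNA copy thus carry the same informational sequence, for instance `rna_to_dna("AUGGCU")` and `"ATGGCT"`.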

The expressions of order used in the present
description and the claims, such as "first nucleotide
sequence", are not adopted so as to express a particular
order, but so as to define the invention more clearly.

Detection of a substance or agent is understood
below to mean both an identification and a quantification,
or a separation or isolation, of the said substance or
said agent.

A better understanding of the invention will be 
gained on reading the detailed description which follows, 
prepared with reference to the attached figures, in which: 

- Figure 1 shows general consensus sequences of
nucleic acids of the MSRV-1B clones amplified by the PCR
technique in the "pol" region defined by Shih (12), from
viral DNA originating from the LM7PC and PLI-2 lines, and
identified under the references SEQ ID NO: 3, SEQ ID NO: 4,
SEQ ID NO: 5 and SEQ ID NO: 6, and the common consensus with
amplification primers bearing the reference SEQ ID NO: 7;

- Figure 2 gives the definition of a functional
reading frame for each MSRV-1B/"PCR pol" type family, the
said families A to D being defined, respectively, by the
nucleotide sequences SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5
and SEQ ID NO: 6 described in Figure 1;

- Figure 3 gives an example of consensus of the 
MSRV-2B sequences, identified by SEQ ID NO: 11; 

- Figure 4 is a representation of the reverse
transcriptase (RT) activity in dpm (disintegrations per
minute) in the sucrose fractions taken from a purification
gradient of the virions produced by the B lymphocytes in
culture from a patient suffering from MS;

- Figure 5 gives, under the same experimental
conditions as in Figure 4, the assay of the reverse
transcriptase activity in the culture of a B lymphocyte
line obtained from a control free from MS;

- Figure 6 shows the nucleotide sequence of the 
clone PSJ17 (SEQ ID NO:9); 

- Figure 7 shows the nucleotide sequence SEQ ID
NO: 8 of the clone designated M003-P004;



WO 98/23755 PCT/IB97/01482 



- Figure 8 shows the nucleotide sequence SEQ ID
NO: 2 of the clone F11-1; the portion located between the
two arrows in the region of the primer corresponds to a
variability imposed by the choice of primer which was used
for the cloning of F11-1; in this same figure, the
translation into amino acids is shown;

- Figure 9 shows the nucleotide sequence SEQ ID
NO:1, and a possible functional reading frame of SEQ ID
NO:1 in terms of amino acids; on this sequence, the
consensus sequences of the pol gene are underlined;

- Figures 10 and 11 give the results of a PCR,
in the form of a photograph under ultraviolet light of an
ethidium bromide-impregnated agarose gel, of the amplification
products obtained from the primers identified by
SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 19;

- Figure 12 gives a representation in matrix
form of the homology between SEQ ID NO:1 of MSRV-1 and
that of an endogenous retrovirus designated HSERV9; this
homology of at least 65% is demonstrated by a continuous
line, the absence of a line meaning a homology of less
than 65%;

- Figure 13 shows the nucleotide sequence SEQ ID
NO: 46 of the clone FBd3;

- Figure 14 shows the sequence homology between
the clone FBd3 and the HSERV-9 retrovirus;

- Figure 15 shows the nucleotide sequence SEQ ID 
NO: 51 of the clone t pol; 

- Figures 16 and 17 show, respectively, the
nucleotide sequences SEQ ID NO: 52 and SEQ ID NO: 53 of the
clones JLBc1 and JLBc2, respectively;

- Figure 18 shows the sequence homology between
the clone JLBc1 and the clone FBd3;

- and Figure 19 the sequence homology between
the clone JLBc2 and the clone FBd3;

- Figure 20 shows the sequence homology between
the clones JLBc1 and JLBc2;

- Figures 21 and 22 show the sequence homology
between the HSERV-9 retrovirus and the clones JLBc1 and
JLBc2, respectively;

- Figure 23 shows the nucleotide sequence SEQ ID
NO: 56 of the clone GM3;

- Figure 24 shows the sequence homology between
the HSERV-9 retrovirus and the clone GM3;

- Figure 25 shows the localization of the
different clones studied, relative to the genome of the
known retrovirus ERV9;

- Figure 26 shows the position of the clones
F11-1, M003-P004, MSRV-1B and PSJ17 in the region
hereinafter designated MSRV-1 pol*;

- Figure 27, split into three successive Figures
27a-27c, shows a possible reading frame covering the whole
of the pol gene;

- Figure 28 shows, according to SEQ ID NO: 40,
the nucleotide sequence coding for the peptide fragment
POL2B, having the amino acid sequence identified by SEQ ID
NO:39;

- Figure 29 shows the OD values (ELISA tests) at
492 nm obtained for 29 sera of MS patients and 32 sera of
healthy controls tested with an anti-IgG antibody;

- Figure 30 shows the OD values (ELISA tests) at
492 nm obtained for 36 sera of MS patients and 42 sera of
healthy controls tested with an anti-IgM antibody;

- Figures 31 to 33 show the results obtained
(relative intensity of the spots) for 43 overlapping
octapeptides covering the amino acid sequence 61-110,
according to the Spotscan technique, respectively with a
pool of MS sera, with a pool of control sera and with the
pool of MS sera after deduction of a background corresponding
to the maximum signal detected on at least one
octapeptide with the control serum (intensity = 1), on the
understanding that these sera were diluted to 1/50. The
bar at the far right-hand end represents a graphic scale
standard unrelated to the serological test;

- Figure 34 shows the SEQ ID NO: 41 and SEQ ID
NO: 42 of two polypeptides comprising immunodominant
regions, while SEQ ID NO: 43 and 44 represent
immunoreactive polypeptides specific to MS;

- Figure 35 shows the nucleotide sequence SEQ ID
NO: 59 of the clone LB19 and three potential reading frames
of SEQ ID NO: 59 in terms of amino acids;

- Figure 36 shows the nucleotide sequence SEQ ID
NO: 88 (GAG*) and a potential reading frame of SEQ ID NO: 88
in terms of amino acids;

- Figure 37 shows the sequence homology between
the clone FBd13 and the HSERV-9 retrovirus; according to
this representation, the continuous line means a
percentage homology greater than or equal to 70% and the
absence of a line means a smaller percentage homology;

- Figure 38 shows the nucleotide sequence SEQ ID
NO: 61 of the clone FP6 and three potential reading frames
of SEQ ID NO: 61 in terms of amino acids;

- Figure 39 shows the nucleotide sequence SEQ ID
NO: 89 of the clone G+E+A and three potential reading
frames of SEQ ID NO: 89 in terms of amino acids;

- Figure 40 shows a reading frame found in the
region E and coding for an MSRV-1 retroviral protease
identified by SEQ ID NO: 90;

- Figure 41 shows the response of each serum of
patients suffering from MS, indicated by the symbol (+),
and of healthy patients, symbolised by (-), tested with an
anti-IgG antibody, expressed as net optical density at
492 nm;

- Figure 42 shows the response of each serum of
patients suffering from MS, indicated by the symbols (+)
and (QS), and of healthy patients (-), tested with an
anti-IgM antibody, expressed as net optical density at
492 nm;

- Figure 43 shows the RT-activity profile in
sucrose density gradients of pellets from B-cell line
supernatants; Control B-cell line ■ was obtained from the
relative of a patient with mitochondriopathy. MS B-cell
line □ was obtained from a patient with definite MS;

- Figure 44 shows the nucleotide and amino acid
alignment of the conserved pol regions of viruses detected
in the study (cf. Example 18) by the "Pan-retrovirus" PCR.
"Deletions" are represented by dashes and standard
single-letter abbreviations are used to designate amino acids and
nucleotides (i = inosine). The most highly conserved VLPQG
and YXDD regions are shown as separate blocks in bold type
at the end of each sequence. Amino acids which are present
in all or in all but one of the sequences are underlined.
PCR primers (modified from (12)) PAN-UO and PAN-UI are
orientated 5' to 3' (sense) whereas primer PAN-DI is 3' to
5' (antisense). Degeneracies are shown above (PAN-UO &
PAN-DI) or below (PAN-UI) the PCR primer sequences.
"X" denotes the nine base 5' extension cttggatcc, »-i»
denotes the nine base 5' extension ctcaagctt. The capture
and detector probes DpV1 and CpV1b used in the ELOSA assay
are shown below a representative MSRV-cpol sequence. At
three positions below the translated MSRV-cpol sequence
alternative amino acids (representing "non-silent" nucleic
acid variations) are shown in italics: K and Y
substitutions were only observed in PLI-1 derived clones
whereas R and W were encoded by a significant proportion
of the clones irrespective of derivation. Note that DpV1
is peroxidase labelled and that CpV1b may be biotinylated
at the 5' end if streptavidin coated plates are used. The
name of each sequence is indicated at the left of the
figure.

HTLV1: Human Leukaemia Virus type 1; HIV1: Human
Immunodeficiency Virus type 1; MoMLV: Moloney-Murine
Leukaemia Virus; MPMV: Mason-Pfizer Monkey Virus; ERV9:
Endogenous Retrovirus 9; MSRV-cpol: Multiple Sclerosis
associated Retrovirus conserved pol region.

- Figure 45 shows a phylogenetic tree which is
based on the conserved amino acid region encoded by the
pol gene of MSRV and of representative endogenous and
exogenous retroviruses and DNA viruses with reverse
transcriptase. It was generated by the U.P.G.M.A. tree
program of Geneworks® software.

HSRV: Human Spumaretrovirus. EIAV: Equine Infectious
Anaemia Virus. BLV: Bovine Leukaemia Virus. HIV1, HIV2:
Human Immunodeficiency Viruses type 1 and 2. HTLV1 and
HTLV2: Human Leukaemia Viruses type 1 and 2. F-MuLV:
Friend-Murine Leukaemia Virus. MoMLV: Moloney-Murine
Leukaemia Virus. BAEV: Baboon Endogenous Virus. GaLV:
Gibbon Ape Leukaemia Virus. HUMER41: Human Endogenous
Retroviral sequence, clone 41. IAP: Intracisternal A-type
Particle. MPMV: Mason-Pfizer Monkey Virus. HERVK10: Human
Endogenous Retrovirus K10. MMTV: Mouse Mammary Tumour
Virus. HSERV9 (ERV9 database sequence): Human sequence of
Endogenous Retrovirus 9. MSRV: Multiple Sclerosis
associated Retrovirus. SIV: Simian Immunodeficiency Virus.
RTLV-H: Reverse Transcriptase-Like Viral sequence H. SFV:
Simian Foamy Virus. VISNA: Visna retrovirus. SIV1: Simian
Immunodeficiency Virus type 1. SRV-2: Simian Retrovirus
type 2. SMRV-H: Squirrel Monkey Retrovirus H.

- Figure 46 shows the MSRV sequence in the
Protease and Reverse-Transcriptase regions of the pol
gene.

The amino acid translation is aligned under the
corresponding nucleotide sequence. The region
corresponding to the Protease ORF, cloned in a recombinant
vector and expressed in E. coli, is boxed. The regions
corresponding to the A and B fragments amplified on plasma
samples from MS patients are indicated by brackets. The
Reverse-Transcriptase (RT) and RNase H (RNH) region is
boxed with a dotted line. The highly conserved amino acids
and/or active sites of enzyme activities of both PRT and
RT (including RNH) are shown underlined.

- Figure 47A illustrates the specific detection
of MSRV-pol RNA sequence by RT-PCR in the sucrose density
fraction associated with RT-activity and in MS plasma;
Figure 47B shows the RT-activity profile on a sucrose
density gradient obtained with extracellular virion
pelleted from an MS choroid-plexus culture. The photograph
below shows an agarose gel loaded with PCR products
amplified from round 1 (ST1.1) RT-PCR products with the
ST1.2 primer set. From left to right: water control 1 from
RT-PCR step with ST1.1 set; water control 2 amplified from
water control 1 with ST1.2 nested primers; molecular
weight markers; fractions no. 1 to 10 corresponding to the
RT-activity profile shown above; plasma samples C1 and C2
from healthy blood donors; plasma samples MS1 and MS2 from
two MS patients.

- Figure 48 shows an example of a variant and/or
recombined sequence in the region of the pol gene defined
by homology with the overlapping regions described in
Figure 25, as GM3, MSRV-1 pol*, t pol and FBd3.

- Figure 49 shows the nucleotide (Figure 49A)
and amino acid (Figure 49B) alignments of the pol region
between clones 1, 5 and 8 of the same patient (Experiment
46-7).

- Figure 50 shows the nucleotide (Figure 50A)
and amino acid (Figure 50B) alignments of the pol region
between clones 41, 43 and 42 of the same patient
(Experiment 68-1).

- Figure 51 shows the nucleotide (Figure 51A)
and amino acid (Figure 51B) alignments of the pol region
between the consensus sequence (SEQ ID NO: 176) of clones
1, 5 and 8 of the same patient (Experiment 46-7) and
SEQ ID NO:1, and between their corresponding peptide
sequences.

- Figure 52 shows the nucleotide (Figure 52A)
and amino acid (Figure 52B) alignments of the pol region
between the consensus sequence (SEQ ID NO: 169) of clones
41, 43 and 42 of the same patient (Experiment 68-1) and
SEQ ID NO:1, and between their corresponding peptide
sequences.

- Figure 53 shows the nucleotide (Figure 53A)
and amino acid (Figure 53B) alignments of the pol region
between the consensus sequence (SEQ ID NO: 176) of clones
1, 5 and 8 of the same patient (Experiment 46-7) and the
consensus sequence (SEQ ID NO: 169) of clones 41, 43 and
42 of the same patient (Experiment 68-1).

Table 5 (at the end of the description) shows
the sequences obtained by RT-PCR with degenerate pol
primers on sucrose density gradient fractions containing
the peak of RT-activity or its negative control (cf.
Example 18); and

Table 6 (at the end of the description) shows
the clinical data and results of MSRV-cpol detection by
"Pan-retro" PCR with specific ELOSA assay, on CSF from MS
and control patients (cf. Example 18).
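
The matrix representations of homology described for Figures 12 and 37, in which a line is drawn only where homology reaches a stated percentage, can be approximated by a windowed comparison. The following Python sketch is offered as an illustration only; the software and window size actually used for the figures are not stated here, so the function name `homology_dot_matrix`, the default window of 10 monomers and the ungapped window-by-window comparison are all assumptions. The default threshold of 65% is the value given for Figure 12.

```python
def homology_dot_matrix(seq_a, seq_b, window=10, threshold=65.0):
    """List the (i, j) start positions of windows meeting the identity threshold.

    Each window of `window` monomers starting at position i in seq_a is
    compared, without gaps, with each window starting at position j in
    seq_b. A pair is reported when the percentage of identical positions
    is at least `threshold`. Plotting the reported pairs gives a dot
    matrix in which runs of homologous windows appear as diagonal lines.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    a, b = seq_a.upper(), seq_b.upper()
    hits = []
    for i in range(len(a) - window + 1):
        win_a = a[i:i + window]
        for j in range(len(b) - window + 1):
            win_b = b[j:j + window]
            same = sum(1 for x, y in zip(win_a, win_b) if x == y)
            if 100.0 * same / window >= threshold:
                hits.append((i, j))
    return hits
```

Raising the threshold to 70.0 would correspond to the convention described for Figure 37. The comparison is quadratic in the sequence lengths, which is acceptable for a sketch on short fragments but not for whole genomes.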

EXAMPLE 1: OBTAINING CLONES DESIGNATED MSRV-1B
AND MSRV-2B, DEFINING, RESPECTIVELY, A RETROVIRUS MSRV-1
AND A COINFECTIVE AGENT MSRV-2, BY "NESTED" PCR AMPLIFICATION
OF THE CONSERVED POL REGIONS OF RETROVIRUSES ON
VIRION PREPARATIONS ORIGINATING FROM THE LM7PC AND PLI-2
LINES

A PCR technique derived from the technique
published by Shih (12) was used. This technique enables
all trace of contaminant DNA to be removed by treating all
the components of the reaction medium with DNase. It
concomitantly makes it possible, by the use of different
but overlapping primers in two successive series of PCR
amplification cycles, to increase the chances of amplifying
a cDNA synthesized from an amount of RNA which is
small at the outset and further reduced in the sample by
the spurious action of the DNase on the RNA. In effect,
the DNase is used under conditions of activity in excess
which enable all trace of contaminant DNA to be removed
before inactivation of this enzyme remaining in the sample
by heating to 85°C for 10 minutes. This variant of the PCR
technique described by Shih (12) was used on a cDNA
synthesized from the nucleic acids of fractions of
infective particles purified on a sucrose gradient
according to the technique described by H. Perron (13)
from the "POL-2" isolate (ECACC No. V92072202) produced by
the PLI-2 line (ECACC No. 92072201) on the one hand, and
from the MS7PG isolate (ECACC No. V93010816) produced by
the LM7PC line (ECACC No. 93010817) on the other hand.
These cultures were obtained according to the methods
which formed the subject of the patent applications
published under Nos WO 93/20188 and WO 93/20189.

After cloning the products amplified by this
technique with the TA Cloning Kit® and sequencing them
using an Applied Biosystems model 373A Automatic
Sequencer, the sequences were compared, using the
Geneworks® software, with the latest available version of
the Genebank® data bank.

The sequences cloned and sequenced from these
samples correspond, in particular, to two types of
sequence: a first type of sequence, to be found in the
majority of the clones (55% of the clones originating from
the POL-2 isolates of the PLI-2 culture, and 67% of the
clones originating from the MS7PG isolates of the LM7PC
cultures), which corresponds to a family of "pol"
sequences closely similar to, but different from, the
endogenous human retrovirus designated ERV-9 or HSERV-9,
and a second type of sequence which corresponds to
sequences very strongly homologous to a sequence
attributed to another infective and/or pathogenic agent
designated MSRV-2.

The first type of sequence, representing the
majority of the clones, consists of sequences whose
variability enables four subfamilies of sequences to be
defined. These subfamilies are sufficiently similar to one
another for it to be possible to consider them to be
quasi-species originating from the same retrovirus, as is
well known for the HIV-1 retrovirus (14), or to be the
outcome of interference with several endogenous proviruses
coregulated in the producing cells. These more or less
defective endogenous elements are sensitive to the same
regulatory signals possibly generated by a replicative
provirus, since they belong to the same family of
endogenous retroviruses (15). This new family of
endogenous retroviruses, or alternatively this new
retroviral species from which the generation of
quasi-species has been obtained in culture, and which contains a
consensus of the sequences described below, is designated
MSRV-1B.

Figure 1 presents the general consensus
sequences of the sequences of the different MSRV-1B clones
sequenced in this experiment, these sequences being
identified, respectively, by SEQ ID NO: 3, SEQ ID NO: 4, SEQ
ID NO: 5 and SEQ ID NO: 6. These sequences display a
nucleic acid homology ranging from 70% to
88% with the HSERV9 sequence referenced X57147 and M37638
in the Genebank® data base. Four "consensus" nucleic acid
sequences representative of different quasi-species of a
possibly exogenous retrovirus MSRV-1B, or of different
subfamilies of an endogenous retrovirus MSRV-1B, have been
defined. These representative consensus sequences are
presented in Figure 2, with the translation into amino
acids. A functional reading frame exists for each
subfamily of these MSRV-1B sequences, and it can be seen
that the functional open reading frame corresponds in each
instance to the amino acid sequence appearing on the
second line under the nucleic acid sequence. The general
consensus of the MSRV-1B sequence, identified by SEQ ID
NO: 7 and obtained by this PCR technique in the "pol"
region, is presented in Figure 1.
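
Homology figures of the kind quoted above (70% to 88% with HSERV9) are commonly expressed as the percentage of identical positions between two aligned nucleic acid sequences. As an illustration only, such a figure could be computed along the following lines. The function name and the two short sequences are hypothetical; they are not taken from the clones, nor from the software actually used for the analysis.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical bases between two pre-aligned sequences.

    Positions where either sequence carries a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical 10-base example: 8 identical positions out of 10.
example = percent_identity("ACGTACGTAC", "ACGTTCGTAA")
```

The sketch assumes that the alignment has already been made; it does not align the sequences itself.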

The second type of sequence representing the
majority of the clones sequenced is represented by the
sequence MSRV-2B presented in Figure 3 and identified by
SEQ ID NO: 11. The differences observed in the sequences
corresponding to the PCR primers are explained by the use
of degenerate primers in mixture form used under different
technical conditions.

The MSRV-2B sequence (SEQ ID NO: 11) is sufficiently
divergent from the retroviral sequences already
described in the data banks for it to be suggested that
the sequence region in question belongs to a new infective
agent, designated MSRV-2. This infective agent would, in
principle, on the basis of the analysis of the first
sequences obtained, be related to a retrovirus but, in
view of the technique used for obtaining this sequence, it
could also be a DNA virus whose genome codes for an enzyme
which incidentally possesses reverse transcriptase
activity, as is the case, for example, with the hepatitis
B virus, HBV (12). Furthermore, the random nature of the
degenerate primers used for this PCR amplification
technique may very well have permitted, as a result of
unforeseen sequence homologies or of conserved sites in
the gene for a related enzyme, the amplification of a
nucleic acid originating from a prokaryotic or eukaryotic
pathogenic and/or coinfective agent (protist).

EXAMPLE 2: OBTAINING CLONES DESIGNATED MSRV-1B
AND MSRV-2B, DEFINING A FAMILY MSRV-1 AND MSRV-2, BY
"NESTED" PCR AMPLIFICATION OF THE CONSERVED POL REGIONS OF
RETROVIRUSES ON PREPARATIONS OF B LYMPHOCYTES FROM A NEW
CASE OF MS

The same PCR technique, modified according to
the technique of Shih (12), was used to amplify and
sequence the RNA nucleic acid material present in a
purified fraction of virions at the peak of "LM7-like"
reverse transcriptase activity on a sucrose gradient
according to the technique described by H. Perron (13),
and according to the protocols mentioned in Example 1,
from a spontaneous lymphoblastoid line obtained by
self-immortalization in culture of B lymphocytes from an MS
patient who was seropositive for the Epstein-Barr virus
(EBV), after setting up the blood lymphoid cells in
culture in a suitable culture medium containing a suitable
concentration of cyclosporin A. A representation of the
reverse transcriptase activity in the sucrose fractions
taken from a purification gradient of the virions produced
by this line is presented in Figure 4. Similarly, the
culture supernatants of a B line obtained under the same
conditions from a control free from MS were treated under
the same conditions, and the assay of reverse
transcriptase activity in the sucrose gradient fractions
proved negative throughout (background), and is presented
in Figure 5. Fraction 3 of the gradient corresponding to
the MS B line and the same fraction without reverse
transcriptase activity of the non-MS control gradient were
analysed by the same RT-PCR technique as before, derived
from Shih (12), followed by the same steps of cloning and
sequencing as described in Example 1.

It is particularly noteworthy that the MSRV-1
and MSRV-2 type sequences are to be found only in the
material associated with a peak of "LM7-like" reverse
transcriptase activity originating from the MS B
lymphoblastoid line. These sequences were not to be found with
the material from the control (non-MS) B lymphoblastoid
line in 26 recombinant clones taken at random. Only
Mo-MuLV type contaminant sequences, originating from the
commercial reverse transcriptase used for the cDNA
synthesis step, and sequences without any particular
retroviral analogy were to be found in this control, as a
result of the "consensus" amplification of homologous
polymerase sequences which is produced by this PCR
technique. Furthermore, the absence of a concentrated
target which competes for the amplification reaction in
the control sample permits the amplification of dilute
contaminants. The difference in results is manifestly
highly significant (chi-squared, p<0.001).

EXAMPLE 3: OBTAINING A CLONE PSJ17, DEFINING A
RETROVIRUS MSRV-1, BY REACTION OF ENDOGENOUS REVERSE
TRANSCRIPTASE WITH A VIRION PREPARATION ORIGINATING FROM
THE PLI-2 LINE

This approach is directed towards obtaining
reverse-transcribed DNA sequences from the supposedly
retroviral RNA in the isolate using the reverse
transcriptase activity present in this same isolate. This
reverse transcriptase activity can theoretically function
only in the presence of a retroviral RNA linked to a
primer tRNA or hybridized with short strands of DNA
already reverse-transcribed in the retroviral particles
(16). Thus, the obtaining of specific retroviral sequences
in a material contaminated with cellular nucleic acids was
optimized according to these authors by means of the
specific enzymatic amplification of the portions of viral
RNAs with a viral reverse transcriptase activity. To this
end, the authors determined the particular physicochemical
conditions under which this enzymatic activity of reverse
transcription on RNAs contained in virions could be
effective in vitro. These conditions correspond to the
technical description of the protocols presented below
(endogenous RT reaction, purification, cloning and
sequencing).

The molecular approach consisted in using a
preparation of concentrated but unpurified virion obtained
from the culture supernatants of the PLI-2 line, prepared
according to the following method: the culture
supernatants are collected twice weekly, precentrifuged at
10,000 rpm for 30 minutes to remove cell debris and then
frozen at -80°C or used as they are for the following
steps. The fresh or thawed supernatants are centrifuged on
a cushion of 30% glycerol-PBS at 100,000 g (or 30,000 rpm
in a type 45 T LKB-HITACHI rotor) for 2 h at 4°C. After
removal of the supernatant, the sedimented pellet is taken
up in a small volume of PBS and constitutes the fraction
of concentrated but unpurified virion. This concentrated
but unpurified viral sample was used to perform a
so-called endogenous reverse transcription reaction, as
described below.
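
The centrifugation step above is stated both as a relative centrifugal force (100,000 g) and as a rotor speed (30,000 rpm). The two are linked by the standard relation RCF = 1.118 x 10^-5 x r x rpm^2, with the radius r in centimetres. The sketch below shows the conversion and the radius implied by the quoted pair of values; the function names are illustrative, and the actual geometry of the rotor named in the text is not given in the source and is not assumed here.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_for_rcf(rcf: float, rpm: float) -> float:
    """Radius (cm) at which the given speed produces the requested RCF."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


# Radius implied by "100,000 g (or 30,000 rpm)": a little under 10 cm.
implied_radius_cm = radius_for_rcf(100_000, 30_000)
```

Because the force grows with the square of the speed, the same 100,000 g corresponds to a different rpm value on a rotor with a different radius.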



A volume of 200 µl of virion purified according
to the protocol described above, and containing a reverse
transcriptase activity of approximately 1-5 million dpm,
is thawed at 37°C until a liquid phase appears, and then
placed on ice. A 5-fold concentrated buffer was prepared
with the following components: 500 mM Tris-HCl pH 8.2;
75 mM NaCl; 25 mM MgCl2; 75 mM DTT and 0.10% NP 40. 100 µl
of 5X buffer + 25 µl of a 100 mM solution of dATP + 25 µl
of a 100 mM solution of dTTP + 25 µl of a 100 mM solution
of dGTP + 25 µl of a 100 mM solution of dCTP + 100 µl of
sterile distilled water + 200 µl of the virion suspension
(RT activity of 5 million dpm) in PBS were mixed and
incubated at 42°C for 3 hours. After this incubation, the
reaction mixture is added directly to a buffered
phenol/chloroform/isoamyl alcohol mixture (Sigma ref.
P 3803); the aqueous phase is collected and one volume of
sterile distilled water is added to the organic phase to
re-extract the residual nucleic acid material. The
collected aqueous phases are combined, and the nucleic
acids contained are precipitated by adding 3M sodium
acetate pH 5.2 to 1/10 volume + 2 volumes of ethanol +
1 µl of glycogen (Boehringer-Mannheim ref. 901 393) and
placing the sample at -20°C for 4 h or overnight at +4°C.
The precipitate obtained after centrifugation is then
washed with 70% ethanol and resuspended in 60 µl of
distilled water. The products of this reaction were then
purified, cloned and sequenced according to the protocol
which will now be described. Blunt-ended DNAs with
unpaired adenines at the ends were generated. A "filling-in"
reaction was first performed: 25 µl of the previously
purified DNA solution were mixed with 2 µl of a 2.5 mM
solution containing, in equimolar amounts, dATP + dGTP +
dTTP + dCTP / 1 µl of T4 DNA polymerase (Boehringer-Mannheim
ref. 1004 786) / 5 µl of 10X "incubation buffer for
restriction enzyme" (Boehringer-Mannheim ref. 1417 975) /
1 µl of a 1% bovine serum albumin solution / 16 µl of
sterile distilled water. This mixture was incubated for
20 minutes at 11°C. 50 µl of TE buffer and 1 µl of
glycogen (Boehringer-Mannheim ref. 901 393) were added
thereto before extraction of the nucleic acids with
phenol/chloroform/isoamyl alcohol (Sigma ref. P 3803) and
precipitation with sodium acetate as described above. The
DNA precipitated after centrifugation is resuspended in
10 µl of 10 mM Tris buffer pH 7.5. 5 µl of this suspension
were then mixed with 20 µl of 5X Taq buffer, 20 µl of 5 mM
dATP, 1 µl (5 U) of Taq DNA polymerase (Amplitaq™) and
54 µl of sterile distilled water. This mixture is
incubated for 2 h at 75°C with a film of oil on the
surface of the solution. The DNA suspended in the aqueous
solution drawn off under the film of oil after incubation
is precipitated as described above and resuspended in 2 µl
of sterile distilled water. The DNA obtained was inserted
into a plasmid using the TA Cloning™ kit. The 2 µl of DNA
solution were mixed with 5 µl of sterile distilled water,
1 µl of a 10-fold concentrated ligation buffer "10X
LIGATION BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/µl) and
1 µl of "TA DNA LIGASE". This mixture was incubated
overnight at 12°C. The following steps were carried out
according to the instructions of the TA Cloning™ kit
(British Biotechnology). At the end of the procedure, the
white colonies of recombinant bacteria were picked
out in order to be cultured and to permit extraction of
the plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable
restriction enzyme and analysed on agarose gel. Plasmids
possessing an insert detected under UV light after
staining the gel with ethidium bromide were selected for
sequencing of the insert, after hybridization with a
primer complementary to the Sp6 promoter present on the
cloning plasmid of the TA Cloning™ kit. The reaction prior
to sequencing was then performed according to the method
recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic
sequencing was carried out with an Applied Biosystems
"Automatic Sequencer, model 373A" apparatus according to
the manufacturer's instructions.
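
The composition of the endogenous reverse transcription mix described above can be checked with a short dilution calculation. The sketch below takes the volumes exactly as listed in the text (only their ratios matter, so the unit cancels out) and derives the final concentrations. The variable and function names are illustrative, and the contribution of the PBS carried by the virion suspension is deliberately ignored.

```python
# Volumes as listed in the text; the unit cancels out in the ratios.
VOLUMES = {
    "5X buffer": 100,
    "dATP (100 mM)": 25,
    "dTTP (100 mM)": 25,
    "dGTP (100 mM)": 25,
    "dCTP (100 mM)": 25,
    "sterile distilled water": 100,
    "virion suspension": 200,
}

# Composition of the 5-fold concentrated buffer (mM, except NP 40 in %).
BUFFER_5X = {
    "Tris-HCl": 500,
    "NaCl": 75,
    "MgCl2": 25,
    "DTT": 75,
    "NP 40 (%)": 0.10,
}


def final_concentration(stock: float, volume: float, total_volume: float) -> float:
    """Concentration of a stock after dilution into the total volume."""
    return stock * volume / total_volume


total_volume = sum(VOLUMES.values())
final_buffer = {
    name: final_concentration(stock, VOLUMES["5X buffer"], total_volume)
    for name, stock in BUFFER_5X.items()
}
final_dntp_mm = final_concentration(100, VOLUMES["dATP (100 mM)"], total_volume)
```

On this reading the mix totals 500 volume units, so the 5X buffer ends up at 1X (100 mM Tris-HCl, 15 mM NaCl, 5 mM MgCl2, 15 mM DTT, 0.02% NP 40) and each dNTP at 5 mM.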

Discriminating analysis on the computerized data
banks of the sequences cloned from the DNA fragments
present in the reaction mixture enabled a retroviral type
sequence to be revealed. The corresponding clone PSJ17 was
completely sequenced, and the sequence obtained, presented
in Figure 6 and identified by SEQ ID NO: 9, was analysed
using the "Geneworks®" software on the updated "Genebank™"
data banks. An identical sequence already described could
not be found by analysis of the data banks. Only a partial
homology with some known retroviral elements was to be
found. The most useful relative homology relates to an
endogenous retrovirus designated ERV-9, or HSERV-9,
according to the references (18).

EXAMPLE 4: PCR AMPLIFICATION OF THE NUCLEIC ACID
SEQUENCE CONTAINED BETWEEN THE 5' REGION DEFINED BY THE
CLONE "POL MSRV-1B" AND THE 3' REGION DEFINED BY THE CLONE
PSJ17

Five oligonucleotides, M001, M002-A, M003-BCD,
P004 and P005, were defined in order to amplify the RNA
originating from purified POL-2 virions. Control reactions
were performed so as to check for the presence of
contaminants (reaction with water). The amplification
consists of an RT-PCR step according to the protocol
described in Example 2, followed by a "nested" PCR
according to the PCR protocol described in the document
EP-A-0,569,272. In the first RT-PCR cycle, the primers
M001 and P004 or P005 are used. In the second PCR cycle,
the primers M002-A or M003-BCD and the primer P004 are
used. The primers are positioned as follows:
[Diagram: positions of the primers M001, M002-A, M003-BCD,
P004 and P005 along the POL-2 RNA, relative to the regions
defined by the clones pol MSRV-1B and PSJ17]

Their composition is:
primer M001: GGTCITICCICAIGG (SEQ ID NO: 20)
primer M002-A: TTAGGGATAGCCCTCATCTCT (SEQ ID NO: 21)
primer M003-BCD: TCAGGGATAGCCCCCATCTAT (SEQ ID NO: 22)
primer P004: AACCCTTTGCCACTACATCAATTT (SEQ ID NO: 23)
primer P005: GCGTAAGGACTCCTAGAGCTATT (SEQ ID NO: 24)
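
As a quick consistency check on the primers listed above, the sketch below computes the length and G+C fraction of three of them and lists the positions at which the two alternative nested 5' primers, M002-A and M003-BCD, differ. The helper names are illustrative. The inosine-containing primer M001 is left out because a simple G+C count does not apply to it.

```python
PRIMERS = {
    "M002-A": "TTAGGGATAGCCCTCATCTCT",    # SEQ ID NO: 21
    "M003-BCD": "TCAGGGATAGCCCCCATCTAT",  # SEQ ID NO: 22
    "P004": "AACCCTTTGCCACTACATCAATTT",   # SEQ ID NO: 23
}


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def mismatch_positions(a: str, b: str) -> list:
    """1-based positions at which two equal-length primers differ."""
    if len(a) != len(b):
        raise ValueError("primers must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


lengths = {name: len(seq) for name, seq in PRIMERS.items()}
differences = mismatch_positions(PRIMERS["M002-A"], PRIMERS["M003-BCD"])
```

The two nested 5' primers have the same length and differ at three positions only, which is consistent with their being variants of one site adapted to different subfamilies.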

The "nested" amplification product obtained, and 
designated M003-P004, is presented in Figure 7, and 
corresponds to the sequence SEQ ID NO: 8. 






WO 98/23755 



PCT/IB97/01482 



EXAMPLE 5: AMPLIFICATION AND CLONING OF A
PORTION OF THE MSRV-1 RETROVIRAL GENOME USING A SEQUENCE
ALREADY IDENTIFIED, IN A SAMPLE OF VIRUS PURIFIED AT THE
PEAK OF REVERSE TRANSCRIPTASE ACTIVITY

A PCR technique derived from the technique
published by Frohman (19) was used. The technique derived
makes it possible, using a specific primer at the 3' end
of the genome to be amplified, to elongate the sequence
towards the 5' region of the genome to be analysed. This
technical variant is described in the documentation of the
firm "Clontech Laboratories Inc." (Palo Alto, California,
USA) supplied with its product "5'-AmpliFINDER™ RACE
Kit", which was used on a fraction of virion purified as
described above.

The specific 3' primers used in the kit protocol
for the synthesis of the cDNA and the PCR amplification
are, respectively, complementary to the following MSRV-1
sequences:

cDNA: TCATCCATGTACCGAAGG (SEQ ID NO: 25)
amplification: ATGGGGTTCCCAAGTTCCCT (SEQ ID NO: 26)

The products originating from the PCR were
purified on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit
(British Biotechnology). The 2 µl of DNA solution were
mixed with 5 µl of sterile distilled water, 1 µl of a
10-fold concentrated ligation buffer "10X LIGATION BUFFER",
2 µl of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA
LIGASE". This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the
insert, after hybridization with a primer complementary to
the Sp6 promoter present on the cloning plasmid of the TA
Cloning™ kit. The reaction prior to sequencing was then
performed according to the method recommended for the use
of the sequencing kit "Prism ready reaction kit dye
deoxyterminator cycle sequencing kit" (Applied Biosystems,
ref. 401384), and automatic sequencing was carried out
with an Applied Biosystems "Automatic Sequencer model
373A" apparatus according to the manufacturer's
instructions.

This technique was applied first to two
fractions of virion purified as described below on sucrose
from the "POL-2" isolate produced by the PLI-2 line on the
one hand, and from the MS7PG isolate produced by the LM7PC
line on the other hand. The culture supernatants are
collected twice weekly, precentrifuged at 10,000 rpm for
30 minutes to remove cell debris and then frozen at -80°C
or used as they are for the following steps. The fresh or
thawed supernatants are centrifuged on a cushion of 30%
glycerol-PBS at 100,000 g (or 30,000 rpm in a type 45 T
LKB-HITACHI rotor) for 2 h at 4°C. After removal of the
supernatant, the sedimented pellet is taken up in a small
volume of PBS and constitutes the fraction of concentrated
but unpurified virions. The concentrated virus is then
applied to a sucrose gradient in sterile PBS buffer (15 to
50% weight/weight) and ultracentrifuged at 35,000 rpm
(100,000 g) for 12 h at +4°C in a swing-out rotor.
Ten fractions are collected, and 20 µl are withdrawn from
each fraction after homogenization to assay the reverse
transcriptase activity therein according to the technique
described by H. Perron (3). The fractions containing the
peak of "LM7-like" RT activity are then diluted in sterile
PBS buffer and ultracentrifuged for one hour at 35,000 rpm
(100,000 g) to sediment the viral particles. The pellet of
purified virion thereby obtained is then taken up in a
small volume of a buffer which is appropriate for the
extraction of RNA. The cDNA synthesis reaction mentioned
above is carried out on this RNA extracted from purified
extracellular virion. PCR amplification according to the
technique mentioned above enabled the clone F1-11 to be
obtained, whose sequence, identified by SEQ ID NO: 2, is
presented in Figure 8.

This clone makes it possible to define, with the
different clones previously sequenced, a region of
considerable length (1.2 kb) representative of the "pol"
gene of the MSRV-1 retrovirus, as presented in Figure 9.
This sequence, designated SEQ ID NO: 1, is reconstituted
from different clones overlapping one another at their
ends, correcting the artefacts associated with the primers
and with the amplification or cloning techniques which
would artificially interrupt the reading frame of the
whole. This sequence will be identified below under the
designation "MSRV-1 pol* region". Its degree of homology
with the HSERV-9 sequence is shown in Figure 12.

In Figure 9, the potential reading frame with
its translation into amino acids is presented below the
nucleic acid sequence.

EXAMPLE 6: DETECTION OF SPECIFIC MSRV-1 AND
MSRV-2 SEQUENCES IN DIFFERENT SAMPLES OF PLASMA
ORIGINATING FROM PATIENTS SUFFERING FROM MS OR FROM
CONTROLS

A PCR technique was used to detect the MSRV-1
and MSRV-2 genomes in plasmas obtained after taking blood
samples onto EDTA from patients suffering from MS and from
non-MS controls.

Extraction of the RNAs from plasma was performed
according to the technique described by P. Chomczynski
(20), after adding one volume of buffer containing
guanidinium thiocyanate to 1 ml of plasma stored frozen at
-80°C after collection.

For MSRV-2, the PCR was performed under the same
conditions and with the following primers:
- 5' primer, identified by SEQ ID NO: 14
5' GTAGTTCGATGTAGAAAGCG 3';
- 3' primer, identified by SEQ ID NO: 15
5' GCATCCGGCAACTGCACG 3'.

However, similar results were also obtained with
the following PCR primers in two successive amplifications
by "nested" PCR on samples of nucleic acids not treated
with DNase.

The primers used for this first step of
40 cycles with a hybridization temperature of 48°C are the
following:
- 5' primer, identified by SEQ ID NO: 27
5' GCCGATATCACCCGCCATGG 3', corresponding to a
5' MSRV-2 PCR primer, for a first PCR on samples from
patients,
- 3' primer, identified by SEQ ID NO: 28
5' GCATCCGGCAACTGCACG 3', corresponding to a 3'
MSRV-2 PCR primer, for a first PCR on samples from
patients.

After this step, 10 µl of the amplification
product are taken and used to carry out a second,
so-called "nested" PCR amplification with primers located
within the region already amplified. This second step
takes place over 35 cycles, with a primer hybridization
("annealing") temperature of 50°C. The reaction volume is
100 µl.

The primers used for this second step are the
following:
- 5' primer, identified by SEQ ID NO: 29
5' CGCGATGCTGGTTGGAGAGC 3', corresponding to a
5' MSRV-2 PCR primer, for a nested PCR on samples from
patients,
- 3' primer, identified by SEQ ID NO: 30
5' TCTCCACTCCGAATATTCCG 3', corresponding to a
3' MSRV-2 PCR primer, for a nested PCR on samples from
patients.

For MSRV-1, the amplification was performed in
two steps. Furthermore, the nucleic acid sample is treated
beforehand with DNase, and a control PCR without RT (AMV
reverse transcriptase) is performed on the two
amplification steps so as to verify that the RT-PCR
amplification comes exclusively from the MSRV-1 RNA. In
the event of a positive control without RT, the initial
aliquot sample of RNA is again treated with DNase and
amplified again.

The protocol for treatment with DNase lacking
RNAse activity is as follows: the extracted RNA is
aliquoted in the presence of "RNAse inhibitor"
(Boehringer-Mannheim) in water treated with DEPC at a
final concentration of 1 µg in 10 µl; to these 10 µl, 1 µl
of "RNAse-free DNAse" (Boehringer-Mannheim) and 1.2 µl of
pH 5 buffer containing 0.1 M sodium acetate and 5 mM
MgSO4 are added; the mixture is incubated for 15 min at
20°C and brought to 95°C for 1.5 min in a "thermocycler".

The first MSRV-1 RT-PCR step is performed
according to a variant of the RNA amplification method as
described in Patent Application No. EP-A-0,569,272. In
particular, the cDNA synthesis step is performed at 42°C
for one hour; the PCR amplification takes place over
40 cycles, with a primer hybridization ("annealing")
temperature of 53°C. The reaction volume is 100 µl.

The primers used for this first step are the
following:
- 5' primer, identified by SEQ ID NO: 16
5' AGGAGTAAGGAAACCCAACGGAC 3';
- 3' primer, identified by SEQ ID NO: 17
5' TAAGAGTTGCACAAGTGCG 3'.

After this step, 10 µl of the amplification
product are taken and used to carry out a second,
so-called "nested" PCR amplification with primers located
within the region already amplified. This second step
takes place over 35 cycles, with a primer hybridization
("annealing") temperature of 53°C. The reaction volume is
100 µl.

The primers used for this second step are the
following:
- 5' primer, identified by SEQ ID NO: 18
5' TCAGGGATAGCCCCCATCTAT 3';
- 3' primer, identified by SEQ ID NO: 19
5' AACCCTTTGCCACTACATCAATTT 3'.
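
The cycling parameters given above for the two-step ("nested") detection of MSRV-2 and of MSRV-1 can be gathered into a small data structure, which makes the two protocols easier to compare. This is only an organizational sketch: the class and field names are hypothetical, and only numbers stated in the text are included.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PcrStep:
    label: str
    cycles: int
    annealing_c: int                 # primer hybridization temperature, in °C
    primer_seq_ids: Tuple[int, int]  # SEQ ID NOs of the 5' and 3' primers


# Nested PCR for MSRV-2 on nucleic acids not treated with DNase.
MSRV2_NESTED = (
    PcrStep("first PCR", 40, 48, (27, 28)),
    PcrStep("nested PCR", 35, 50, (29, 30)),
)

# Nested RT-PCR for MSRV-1 on DNase-treated RNA.
MSRV1_NESTED = (
    PcrStep("first RT-PCR", 40, 53, (16, 17)),
    PcrStep("nested PCR", 35, 53, (18, 19)),
)


def total_cycles(steps: Tuple[PcrStep, ...]) -> int:
    """Total number of amplification cycles over both steps."""
    return sum(step.cycles for step in steps)
```

Both protocols run 75 cycles in total; they differ in the annealing temperatures and in the DNase and RT controls that apply to MSRV-1 only.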

Figures 10 and 11 present the results of PCR in
the form of photographs under ultraviolet light of
ethidium bromide-impregnated agarose gels, in which an
electrophoresis of the PCR amplification products applied
separately to the different wells was performed.

The top photograph (Figure 10) shows the result 
of specific MSRV-2 amplification. 

Well number 8 contains a mixture of DNA
molecular weight markers, and wells 1 to 7 represent, in
order, the products amplified from the total RNAs of
plasmas originating from 4 healthy controls free from MS
(wells 1 to 4) and from 3 patients suffering from MS at
different stages of the disease (wells 5 to 7).

In this series, MSRV-2 nucleic acid material is
detected in the plasma of one case of MS out of the 3
tested, and in none of the 4 control plasmas. Other
results obtained on more extensive series confirm these
results.

The bottom photograph (Figure 11) shows the
result of specific amplification by MSRV-1 "nested"
RT-PCR:

well No. 1 contains the PCR product produced
with water alone, without the addition of AMV reverse
transcriptase; well No. 2 contains the PCR product
produced with water alone, with the addition of AMV
reverse transcriptase; well No. 3 contains a mixture of
DNA molecular weight markers; wells 4 to 13 contain, in
order, the products amplified from the total RNAs
extracted from sucrose gradient fractions (collected in a
downward direction), on which gradient a pellet of virion
originating from a supernatant of a culture infected with
MSRV-1 and MSRV-2 was centrifuged to equilibrium according
to the protocol described by H. Perron (13); to well 14
nothing was applied; to wells 15 to 17, the amplified
products of RNA extracted from plasmas originating from 3
different patients suffering from MS at different stages
of the disease were applied.

The MSRV-1 retroviral genome is indeed to be
found in the sucrose gradient fraction containing the peak
of reverse transcriptase activity measured according to
the technique described by H. Perron (3), with a very
strong intensity (fraction 5 of the gradient, placed in
well No. 8). A slight amplification has taken place in the
first fraction (well No. 4), probably corresponding to RNA
released by lysed particles which floated at the surface
of the gradient; similarly, aggregated debris has
sedimented in the last fraction (tube bottom), carrying
with it a few copies of the MSRV-1 genome which have given
rise to an amplification of low intensity.

Of the 3 MS plasmas tested in this series, MSRV-1
RNA was detected in one case, producing a very intense
amplification (well No. 17).

In this series, the MSRV-1 retroviral RNA
genome, probably corresponding to particles of
extracellular virus present in the plasma in extremely
small numbers, was detected by "nested" RT-PCR in one case
of MS out of the 3 tested. Other results obtained on more
extensive series confirm these results.

Furthermore, the specificity of the sequences
amplified by these PCR techniques may be verified and
evaluated by the "ELOSA" technique as described by
F. Mallet (21) and in the document FR-A-2,663,040.

For MSRV-1, the products of the nested PCR
described above may be tested in two ELOSA systems
enabling a consensus A and a consensus B+C+D of MSRV-1 to
be detected separately, corresponding to the subfamilies
described in Example 1 and Figures 1 and 2. In effect, the
sequences closely resembling the consensus B+C+D are to be
found essentially in the RNA samples originating from
MSRV-1 virions purified from cultures or amplified in
extracellular biological fluids of MS patients, whereas
the sequences closely resembling the consensus A are
essentially to be found in normal human cellular DNA.

The ELOSA/MSRV-1 system for the capture and
specific hybridization of the PCR products of the
subfamily A uses a capture oligonucleotide cpV1A with an
amine bond at the 5' end and a biotinylated detection
oligonucleotide dpV1A having as their sequence,
respectively:

- cpV1A identified by SEQ ID NO: 31
5' GATCTAGGCCACTTCTCAGGTCCAGS 3', corresponding
to the ELOSA capture oligonucleotide for the products of
MSRV-1 nested PCR performed with the primers identified by
SEQ ID NO: 16 and SEQ ID NO: 17, optionally followed by
amplification with the primers identified by SEQ ID NO: 18
and SEQ ID NO: 19 on samples from patients;
- dpV1A identified by SEQ ID NO: 32
5' CATCTITTTGGICAGGCAITAGC 3', corresponding to
the ELOSA capture oligonucleotide for the subfamily A of
the products of MSRV-1 "nested" PCR performed with the
primers identified by SEQ ID NO: 16 and SEQ ID NO: 17,
optionally followed by amplification with the primers
identified by SEQ ID NO: 18 and SEQ ID NO: 19 on samples
from patients.

The ELOSA/MSRV-1 system for the capture and
specific hybridization of the PCR products of the
subfamily B+C+D uses the same biotinylated detection
oligonucleotide dpV1A and a capture oligonucleotide cpV1B
with an amine bond at the 5' end having as its sequence:

- cpV1B identified by SEQ ID NO: 33

5' CTTGAGCCAGTTCTCATACCTGGA 3', corresponding to
the ELOSA capture oligonucleotide for the subfamily B+C+D
of the products of MSRV-1 "nested" PCR performed with
the primers identified by SEQ ID NO: 16 and SEQ ID NO: 17,
optionally followed by amplification with the primers
identified by SEQ ID NO: 18 and SEQ ID NO: 19 on samples
from patients.

This ELOSA detection system enabled it to be
verified that none of the PCR products thus amplified from
DNase-treated plasmas of MS patients contained a sequence
of the subfamily A, and that all were positive with the
consensus of the subfamilies B, C and D.

For MSRV-2, a similar ELOSA technique was evaluated
on isolates originating from infected cell cultures,
using the following PCR amplification primers:

- 5' primer, identified by SEQ ID NO: 34

5' AGTGYTRCCMCARGGCGCTGAA 3', corresponding to a
5' MSRV-2 PCR primer, for PCR on samples from cultures,

- 3' primer, identified by SEQ ID NO: 35

5' GMGGCCAGCAGSAKGTCATCCA 3', corresponding to a
3' MSRV-2 PCR primer, for PCR on samples from cultures,

and the capture oligonucleotide cpV2, with an amine
bond at the 5' end, and the biotinylated detection
oligonucleotide dpV2, having as their respective sequences:

- cpV2 identified by SEQ ID NO: 36

5' GGATGCCGCCTATAGCCTCTAC 3', corresponding to an
ELOSA capture oligonucleotide for the products of MSRV-2
PCR performed with the primers SEQ ID NO: 34 and SEQ ID
NO: 35, or optionally with the degenerate primers defined
by Shih (12);

- dpV2 identified by SEQ ID NO: 37

5' AAGCCTATCGCGTGCAGTTGCC 3', corresponding to
an ELOSA detection oligonucleotide for the products of
MSRV-2 PCR performed with the primers SEQ ID NO: 34 and SEQ
ID NO: 35, or optionally with the degenerate primers
defined by Shih (12).
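The MSRV-2 primers SEQ ID NO: 34 and SEQ ID NO: 35 contain degenerate positions (Y, R, M, S, K). As an illustration only, the following sketch expands such a primer into the individual sequences it represents, assuming the letters follow the standard IUPAC nucleotide ambiguity code; the function names are hypothetical and are not part of the method described here.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (assumed to apply to these primers).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

PRIMER_5 = "AGTGYTRCCMCARGGCGCTGAA"  # SEQ ID NO: 34
PRIMER_3 = "GMGGCCAGCAGSAKGTCATCCA"  # SEQ ID NO: 35


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every non-degenerate sequence covered by the primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


if __name__ == "__main__":
    for name, primer in (("SEQ ID NO: 34", PRIMER_5), ("SEQ ID NO: 35", PRIMER_3)):
        print(name, len(primer), "nt, degeneracy", degeneracy(primer))
```

Under this assumption, each primer is a mixture of a small number of defined sequences, which may be listed with `expand` when individual oligonucleotides need to be compared with a target.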

This PCR amplification system with a pair of
primers different from those which were described
previously for amplification on the samples from patients made
it possible to confirm the infection with MSRV-2 of in
vitro cultures and of samples of nucleic acids used for
the molecular biology studies.

All things considered, the first results of PCR
detection of the genome of pathogenic and/or infective
agents show that it is possible that free "virus" may
circulate in the blood stream of patients in an acute,
virulent phase, outside the nervous system. This is
compatible with the almost invariable presence of "gaps"
in the blood-brain barrier of patients in an active phase
of MS.


EXAMPLE 7: OBTAINING SEQUENCES OF THE "env" GENE
OF THE MSRV-1 RETROVIRAL GENOME

As has already been described in Example 5, a
PCR technique derived from the technique published by
Frohman (19) was used. The derived technique makes it
possible, using a specific primer at the 3' end of the
genome to be amplified, to elongate the sequence towards
the 5' region of the genome to be analysed. This technical
variant is described in the documentation of Clontech
Laboratories Inc. (Palo Alto, California, USA) supplied
with its product "5'-AmpliFINDER™ RACE Kit", which was
used on a fraction of virion purified as described above.

In order to carry out an amplification of the 3'
region of the MSRV-1 retroviral genome encompassing the
region of the "env" gene, a study was carried out to
determine a consensus sequence in the LTR regions of the
same type as those of the defective endogenous retrovirus
HSERV-9 (18, 24), with which the MSRV-1 retrovirus
displays partial homologies.

The same specific 3' primer was used in the kit
protocol for the synthesis of the cDNA and the PCR
amplification; its sequence is as follows:

GTGCTGATTGGTGTATTTACAATCC (SEQ ID NO: 45)

Synthesis of the complementary DNA (cDNA) and
unidirectional PCR amplification with the above primer
were carried out in one step according to the method
described in Patent EP-A-0,569,272.

The products originating from the PCR were
extracted after purification on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit (British
Biotechnology). The 2 µl of DNA solution were mixed with 5
µl of sterile distilled water, 1 µl of a 10-fold
concentrated ligation buffer "10X LIGATION BUFFER", 2 µl
of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA LIGASE".
This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning® kit (British
Biotechnology). At the end of the procedure, the white colonies of



WO 98/23755 



PCT/IB97/01482 



recombinant bacteria were picked out in order to
be cultured and to permit extraction of the plasmids
incorporated according to the so-called "miniprep"
procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the
insert, after hybridization with a primer complementary to
the Sp6 promoter present on the cloning plasmid of the TA
Cloning™ Kit. The reaction prior to sequencing was then
performed according to the method recommended for the use
of the sequencing kit "Prism ready reaction kit dye
deoxyterminator cycle sequencing kit" (Applied Biosystems,
ref. 401384), and automatic sequencing was carried out
with an Applied Biosystems "automatic sequencer, model
373 A" apparatus according to the manufacturer's
instructions.

This technical approach was applied to a sample
of virion concentrated as described below from a mixture
of culture supernatants produced by B lymphoblastoid lines
such as are described in Example 2, established from
lymphocytes of patients suffering from MS and possessing
reverse transcriptase activity which is detectable
according to the technique described by Perron et al. (3):
the culture supernatants are collected twice weekly,
precentrifuged at 10,000 rpm for 30 minutes to remove cell
debris and then frozen at -80°C or used as they are for
the following steps. The fresh or thawed supernatants are
centrifuged on a cushion of 30% glycerol-PBS at 100,000 g
for 2 h at 4°C. After removal of the supernatant, the
sedimented pellet constitutes the sample of concentrated
but unpurified virions. The pellet thereby obtained is
then taken up in a small volume of an appropriate buffer
for the extraction of RNA. The cDNA synthesis reaction
mentioned above is carried out on this RNA extracted from
concentrated extracellular virion.

RT-PCR amplification according to the technique 
mentioned above enabled the clone FBd3 to be obtained, 
whose sequence, identified by SEQ ID NO: 46, is presented 
in Figure 13. 

In Figure 14, the sequence homology between the
clone FBd3 and the HSERV-9 retrovirus is shown on the
matrix chart by a continuous line for any partial homology
greater than or equal to 65%. It can be seen that there
are homologies in the flanking regions of the clone (with
the pol gene at the 5' end and with the env gene and then
the LTR at the 3' end), but that the internal region is
totally divergent and does not display any homology, even
weak, with the "env" gene of HSERV-9. Furthermore, it is
apparent that the clone FBd3 contains a longer "env"
region than the one which is described for the defective
endogenous HSERV-9; it may thus be seen that the internal
divergent region constitutes an "insert" between the
regions of partial homology with the HSERV-9 defective
genes.
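The matrix chart of Figure 14 marks regions whose partial homology is at least 65%. A minimal sketch of how such a dot matrix might be computed is given below; the sliding-window approach, the window length of 20 nucleotides and the function name are assumptions made for illustration, since the software and parameters actually used are not stated here.

```python
def dot_matrix(seq_a, seq_b, window=20, min_identity=0.65):
    """Return (i, j) window start positions whose identity meets the threshold.

    Each hit corresponds to one point of a dot matrix; runs of hits along a
    diagonal appear as a continuous line on the chart.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        segment_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            segment_b = seq_b[j:j + window]
            matches = sum(a == b for a, b in zip(segment_a, segment_b))
            if matches / window >= min_identity:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from FBd3 or HSERV-9.
    toy = "ACGTTGCAAGGCTTAGCCATGCAT"
    print(dot_matrix(toy, toy, window=8)[:5])
```

A region such as the divergent "insert" described above would appear on such a chart as a gap in the diagonal, flanked by lines on either side.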

EXAMPLE 8: AMPLIFICATION, CLONING AND SEQUENCING
OF THE REGION OF THE MSRV-1 RETROVIRAL GENOME LOCATED
BETWEEN THE CLONES PSJ17 AND FBd3

Four oligonucleotides, F1, B4, F6 and B1, were
defined for amplifying RNA originating from concentrated
virions of the strains POL2 and MS7PG. Control reactions
were performed so as to check for the presence of
contaminants (reaction with water). The amplification
consists of a first step of RT-PCR according to the
protocol described in Patent Application EP-A-0,569,272,
followed by a second step of PCR performed on 10 µl of
product of the first step with primers internal to the
amplified first region ("nested" PCR). In the first RT-PCR
cycle, the primers F1 and B4 are used. In the second PCR
cycle, the primers F6 and B1 are used. The
primers are positioned as follows:

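[Diagram: positions of the primers, in the order F1, F6, B1, B4, along the MSRV-1 RNA, shown relative to the clones PSJ17 and FBd3; region labels: 5' pol MSRV-1, 5' env, 3' pol MSRV-1.]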

Their composition is:

primer F1: TGATGTGAACGGCATACTCACTG (SEQ ID NO: 47)
primer B4: CCCAGAGGTTAGGAACTCCCTTTC (SEQ ID NO: 48)
primer F6: GCTAAAGGAGACTTGTGGTTGTCAG (SEQ ID NO: 49)
primer B1: CAACATGGGCATTTCGGATTAG (SEQ ID NO: 50)

The product of "nested" amplification obtained
and designated "t pol" is presented in Figure 15, and
corresponds to the sequence SEQ ID NO: 51.
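By way of illustration, the length, GC content and an approximate melting temperature of the four primers may be computed as follows. The Wallace rule (2°C per A or T, 4°C per G or C) is a common rule of thumb for short oligonucleotides and is an assumption introduced here; it is not a parameter given in this example, and the function names are hypothetical.

```python
PRIMERS = {
    "F1": "TGATGTGAACGGCATACTCACTG",    # SEQ ID NO: 47
    "B4": "CCCAGAGGTTAGGAACTCCCTTTC",   # SEQ ID NO: 48
    "F6": "GCTAAAGGAGACTTGTGGTTGTCAG",  # SEQ ID NO: 49
    "B1": "CAACATGGGCATTTCGGATTAG",     # SEQ ID NO: 50
}


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C (Wallace rule, assumed here)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
              f"approx. Tm {wallace_tm(seq)} C")
```

Such estimates are only indicative; the annealing conditions actually used are those of the protocol cited above.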



EXAMPLE 9: OBTAINING NEW SEQUENCES, EXPRESSED AS
RNA IN CELLS IN CULTURE PRODUCING MSRV-1, AND COMPRISING
AN "env" REGION OF THE MSRV-1 RETROVIRAL GENOME

A library of cDNA was produced according to the
procedure described by the manufacturer of the "cDNA
synthesis module, cDNA rapid adaptator ligation module,
cDNA rapid cloning module and lambda gt10 in vitro
packaging module" kits (Amersham, ref RPN1256Y/Z, RPN1712,
RPN1713, RPN1717, N334Z), from the messenger RNA extracted
from cells of a B lymphoblastoid line such as is described
in Example 2, established from the lymphocytes of a
patient suffering from MS and possessing reverse
transcriptase activity which is detectable according to
the technique described by Perron et al. (3).

Oligonucleotides were defined for amplifying the
cDNA cloned into the nucleic acid library between the 3'
region of the clone PSJ17 (pol) and the 5' (LTR) region of
the clone FBd3. Control reactions were performed so as to
check for the presence of contaminants (reaction with
water). PCR reactions performed on the nucleic acids
cloned into the library with different pairs of primers
enabled a series of clones linking pol sequences to the
MSRV-1 type env or LTR sequences to be amplified.

Two clones are representative of the sequences
obtained in the cellular cDNA library:

- the clone JLBc1, whose sequence SEQ ID NO: 52 is
presented in Figure 16;

- the clone JLBc2, whose sequence SEQ ID NO: 53 is
presented in Figure 17.

The sequences of the clones JLBc1 and JLBc2 are
homologous to that of the clone FBd3, as is apparent in
Figures 18 and 19. The homology between the clone JLBc1
and the clone JLBc2 is shown in Figure 20.

The homologies between the clones JLBc1 and
JLBc2 on the one hand and the HSERV-9 sequence on the other
hand are presented, respectively, in Figures 21 and 22.

It will be noted that the region of homology
between JLBc1, JLBc2 and FBd3 comprises, with a few sequence
and size variations of the "insert", the additional
sequence absent ("inserted") in the HSERV-9 env sequence,
as described in Example 8.

It will also be noted that the cloned "pol"
region is very homologous to HSERV-9, does not possess a
reading frame (bearing in mind the sequence errors induced
by the techniques used, including even the automatic
sequencer) and diverges from the MSRV-1 sequences obtained
from virions. In view of the fact that these sequences
were cloned from the RNA of cells expressing MSRV-1
particles, it is probable that they originate from
endogenous retroviral elements related to the ERV9 family;
this is all the more likely given that the pol and
env genes are present on the same RNA, which is clearly not
the MSRV-1 genomic RNA. Some of these ERV9 elements
possess functional LTRs which can be activated by
replicative viruses coding for homologous or heterologous
transactivators. Under these conditions, the relationship
between MSRV-1 and HSERV-9 makes probable the
transactivation of the defective (or otherwise) endogenous
ERV9 elements by homologous, or even identical, MSRV-1
transactivating proteins.

Such a phenomenon may induce a viral interference
between the expression of MSRV-1 and the related
endogenous elements. Such an interference generally leads
to a so-called "defective-interfering" expression, some
features of which were found in the MSRV-1-infected
cultures studied. Furthermore, such a phenomenon is bound
to give rise to the expression of polypeptides, or even
of endogenous retroviral proteins, which are not
necessarily tolerated by the immune system. Such a scheme
of aberrant expression of endogenous elements related to
MSRV-1 and induced by the latter is liable to multiply the
aberrant antigens, and hence to contribute to the
induction of autoimmune processes such as are observed in
MS.

It is, however, essential to note that the
clones JLBc1 and JLBc2 differ from the ERV9 or HSERV-9
sequence already described, in that they possess a longer
env region comprising an additional region totally
divergent from ERV9. Their kinship with the endogenous
ERV9 family may hence be defined, but they clearly
constitute novel elements never hitherto described. Indeed,
interrogation of the data banks of nucleic acid
sequences available in version No. 15 (1995) of the
"Entrez" software (NCBI, NIH, Bethesda, USA) did not
enable a known homologous sequence in the env region of
these clones to be identified.

EXAMPLE 10: OBTAINING SEQUENCES LOCATED IN THE
5' pol AND 3' gag REGION OF THE MSRV-1 RETROVIRAL GENOME

As has already been described in Example 5, a
PCR technique derived from the technique published by
Frohman (19) was used. The derived technique makes it
possible, using a specific primer at the 3' end of the
genome to be amplified, to elongate the sequence towards
the 5' region of the genome to be analysed. This technical
variant is described in the documentation of the firm
Clontech Laboratories Inc. (Palo Alto, California, USA)
supplied with its product "5'-AmpliFINDER™ RACE Kit",
which was used on a fraction of virion purified as
described above.

In order to carry out an amplification of the 5'
region of the MSRV-1 retroviral genome starting from the
pol sequence already sequenced (clone F11-1) and extending
towards the gag gene, MSRV-1 specific primers were
defined.

The specific 3' primers used in the kit protocol
for the synthesis of the cDNA and the PCR amplification
are, respectively, complementary to the following MSRV-1
sequences:

cDNA: (SEQ ID NO: 54)
CCTGAGTTCTTGCACTAACCC

amplification: (SEQ ID NO: 55)
GTCCGTTGGGTTTCCTTACTCCT

The products originating from the PCR were
extracted after purification on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit (British
Biotechnology). The 2 µl of DNA solution were mixed with 5
µl of sterile distilled water, 1 µl of a 10-fold
concentrated ligation buffer "10X LIGATION BUFFER", 2 µl
of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA LIGASE".
This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning® kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable
restriction enzyme and analysed on agarose gel. Plasmids
possessing an insert detected under UV light after
staining the gel with ethidium bromide were selected for
sequencing of the insert, after hybridization with a
primer complementary to the Sp6 promoter present on the
cloning plasmid of the TA Cloning™ Kit. The reaction prior
to sequencing was then performed according to the method
recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic
sequencing was carried out with an Applied Biosystems
"automatic sequencer model 373 A" apparatus according to
the manufacturer's instructions.

This technical approach was applied to a sample
of virion concentrated as described below from a mixture
of culture supernatants produced by B lymphoblastoid lines
such as are described in Example 2, established from
lymphocytes of patients suffering from MS and possessing
reverse transcriptase activity which is detectable
according to the technique described by Perron et al. (3):
the culture supernatants are collected twice weekly,
precentrifuged at 10,000 rpm for 30 minutes to remove cell
debris and then frozen at -80°C or used as they are for
the following steps. The fresh or thawed supernatants are
centrifuged on a cushion of 30% glycerol-PBS at 100,000 g
for 2 h at 4°C. After removal of the supernatant, the
sedimented pellet constitutes the sample of concentrated
but unpurified virions. The pellet thereby obtained is
then taken up in a small volume of an appropriate buffer
for the extraction of RNA. The cDNA synthesis reaction
mentioned above is carried out on this RNA extracted from
concentrated extracellular virion.

RT-PCR amplification according to the technique
mentioned above enabled the clone GM3 to be obtained,
whose sequence, identified by SEQ ID NO: 56, is presented
in Figure 23.

In Figure 24, the sequence homology between the
clone GM3 and the HSERV-9 retrovirus is shown on the
matrix chart by a continuous line, for any partial
homology greater than or equal to 65%.

In summary, Figure 25 shows the localization of
the different clones studied above, relative to the known
ERV9 genome. In Figure 25, since the MSRV-1 env region is
longer than the reference ERV9 env gene, the additional
region is shown above the point of insertion according to
a "V", on the understanding that the inserted material
displays a sequence and size variability between the
clones shown (JLBc1, JLBc2, FBd3). Figure 26 shows the
position of different clones studied in the MSRV-1 pol
region.

By means of the clone GM3 described above, a 
possible reading frame could be defined, covering the 
whole of the pol gene, referenced according to SEQ ID 
NO: 57, shown in the successive Figures 27a to 27c. 


EXAMPLE 11: DETECTION OF ANTI-MSRV-1 SPECIFIC 
ANTIBODIES IN HUMAN SERUM 

Identification of the sequence of the pol gene
of the MSRV-1 retrovirus and of an open reading frame of
this gene enabled the amino acid sequence SEQ ID NO: 39 of
a region of the said gene, referenced SEQ ID NO: 40, to be
determined (see Figure 28).

Different synthetic peptides corresponding to
fragments of the protein sequence of MSRV-1 reverse
transcriptase encoded by the pol gene were tested for
their antigenic specificity with respect to sera of
patients suffering from MS and of healthy controls.

The peptides were synthesized chemically by
solid-phase synthesis according to the Merrifield
technique (Barany G. and Merrifield R.B., 1980, in The
Peptides, 2, 1-284, Gross E. and Meienhofer J., Eds.,
Academic Press, New York). The practical details are those
described below.

a) Peptide synthesis: 

The peptides were synthesized on a
phenylacetamidomethyl (PAM)/polystyrene/divinylbenzene resin
(Applied Biosystems, Inc., Foster City, CA), using an
"Applied Biosystems 430A" automatic synthesizer. The amino
acids are coupled in the form of hydroxybenzotriazole
(HOBT) esters. The amino acids used are obtained from
Novabiochem (Läufelfingen, Switzerland) or Bachem
(Bubendorf, Switzerland).

The chemical synthesis was performed using a
double coupling protocol with N-methylpyrrolidone (NMP) as
solvent. The peptides were cleaved from the resin, and the
side-chain protective groups removed simultaneously, using
hydrofluoric acid (HF) in a suitable apparatus (type I
cleavage apparatus, Peptide Institute, Osaka, Japan).

For 1 g of peptidyl resin, 10 ml of HF, 1 ml of
anisole and 1 ml of dimethyl sulphide (DMS) are used. The
mixture is stirred for 45 minutes at -2°C. The HF is then
evaporated off under vacuum. After intensive washes with
ether, the peptide is eluted from the resin with 10%
acetic acid and then lyophilized.

The peptides are purified by preparative high
performance liquid chromatography on a VYDAC C18 type
column (250 x 21 mm) (The Separation Group, Hesperia, CA,
USA). Elution is carried out with an acetonitrile gradient
at a flow rate of 22 ml/min. The fractions collected are
monitored by an elution under isocratic conditions on a
VYDAC® C18 analytical column (250 x 4.6 mm) at a flow rate
of 1 ml/min. Fractions having the same retention time are
pooled and lyophilized. The preponderant fraction is then
analysed by analytical high performance liquid
chromatography with the system described above. The
peptide which is considered to be of acceptable purity
manifests itself in a single peak representing not less
than 95% of the chromatogram.

The purified peptides are then analysed with the
object of monitoring their amino acid composition, using
an Applied Biosystems 420H automatic amino acid analyser.
Measurement of the (average) chemical molecular mass of
the peptides is obtained using LSIMS mass spectrometry in
the positive ion mode on a VG ZAB-ZSEQ double focusing
instrument connected to a DEC-VAX 2000 acquisition system
(VG Analytical Ltd, Manchester, England).

The reactivity of the different peptides was
tested against sera of patients suffering from MS and
against sera of healthy controls. This enabled a peptide
designated POL2B to be selected, whose sequence, identified
by SEQ ID NO: 39, is shown in Figure 28, and which is
encoded by the pol gene of MSRV-1 (nucleotides 181 to
330).

b) Antigenic properties: 

The antigenic properties of the POL2B peptide
were demonstrated according to the ELISA protocol
described below.

The lyophilized POL2B peptide was dissolved in
sterile distilled water at a concentration of 1 mg/ml.
This stock solution was aliquoted and kept at +4°C for use
over a fortnight, or frozen at -20°C for use within 2
months. An aliquot is diluted in PBS (phosphate buffered
saline) solution so as to obtain a final peptide
concentration of 1 microgram/ml. 100 microlitres of this
dilution are placed in each well of microtitration plates
("high-binding" plastic, COSTAR ref: 3590). The plates are
covered with a "plate-sealer" type adhesive and kept
overnight at +4°C for the phase of adsorption of the
peptide to the plastic. The adhesive is removed and the
plates are washed three times with a volume of 300
microlitres of a solution A (1X PBS, 0.05% Tween 20®), then
inverted over an absorbent tissue. The plates thus drained
are filled with 200 microlitres per well of a solution B
(solution A + 10% of goat serum), then covered with an
adhesive and incubated for 45 minutes to 1 hour at 37°C.
The plates are then washed three times with the solution A
as described above.
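The coating step above implies a 1000-fold dilution of the stock solution and a fixed amount of peptide per well. The short sketch below works through that arithmetic; the 96-well plate used in the usage example is an assumption for illustration, as the number of wells coated is not stated.

```python
STOCK_UG_PER_ML = 1000.0   # stock solution: 1 mg/ml
WORKING_UG_PER_ML = 1.0    # coating solution: 1 microgram/ml
WELL_VOLUME_ML = 0.100     # 100 microlitres per well

# Dilution factor between the stock and the coating solution.
DILUTION_FACTOR = STOCK_UG_PER_ML / WORKING_UG_PER_ML

# Amount of peptide deposited in each well, in micrograms.
PEPTIDE_PER_WELL_UG = WORKING_UG_PER_ML * WELL_VOLUME_ML


def stock_volume_ul(final_volume_ml):
    """Microlitres of stock needed to prepare a given volume of coating solution."""
    return final_volume_ml * 1000.0 / DILUTION_FACTOR


if __name__ == "__main__":
    wells = 96  # hypothetical full plate
    total_ml = wells * WELL_VOLUME_ML
    print(f"dilution 1/{DILUTION_FACTOR:.0f}, "
          f"{PEPTIDE_PER_WELL_UG * 1000:.0f} ng peptide per well")
    print(f"{total_ml:.1f} ml coating solution needs "
          f"{stock_volume_ul(total_ml):.1f} microlitres of stock")
```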

The test serum samples are diluted beforehand to
1/50 in the solution B, and 100 microlitres of each dilute
test serum are placed in the wells of each microtitration
plate. A negative control is placed in one well of each
plate, in the form of 100 microlitres of buffer B. The
plates covered with an adhesive are then incubated for 1
to 3 hours at 37°C. The plates are then washed three times
with the solution A as described above. In parallel, a
peroxidase-labelled goat antibody directed against human
IgG (Sigma Immunochemicals ref. A6029) or IgM (Cappel ref.
55228) is diluted in the solution B (dilution 1/5000 for
the anti-IgG and 1/1000 for the anti-IgM). 100 microlitres
of the appropriate dilution of the labelled antibody are
then placed in each well of the microtitration plates, and
the plates covered with an adhesive are incubated for 1 to
2 hours at 37°C. A further washing of the plates is then
performed as described above. In parallel, the peroxidase
substrate is prepared according to the directions of the
"Sigma fast OPD kit" (Sigma Immunochemicals, ref. P9187).
100 microlitres of substrate solution are placed in each
well, and the plates are placed protected from light for
20 to 30 minutes at room temperature.

When the colour reaction has stabilized, the
plates are placed immediately in an ELISA plate
spectrophotometric reader, and the optical density (OD) of
each well is read at a wavelength of 492 nm.
Alternatively, 30 microlitres of 1N HCl are placed in each well
to stop the reaction, and the plates are read in the
spectrophotometer within 24 hours.

The serological samples are introduced in
duplicate or in triplicate, and the optical density (OD)
corresponding to the serum tested is calculated by taking
the mean of the OD values obtained for the same sample at
the same dilution.

The net OD of each serum corresponds to the mean
OD of the serum minus the mean OD of the negative control
(solution B: PBS, 0.05% Tween 20®, 10% goat serum).
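The net OD calculation described above may be sketched as follows. The replicate readings in the usage example are hypothetical values chosen for illustration and are not measurements from this study.

```python
from statistics import mean


def net_od(serum_replicates, negative_control_replicates):
    """Mean OD of the serum replicates minus mean OD of the negative control."""
    return mean(serum_replicates) - mean(negative_control_replicates)


if __name__ == "__main__":
    # Hypothetical duplicate readings at 492 nm.
    serum = [0.75, 0.79]
    negative_control = [0.09, 0.11]
    print(f"net OD = {net_od(serum, negative_control):.2f}")
```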

c) Detection of anti-MSRV-1 IgG antibodies by
ELISA:

The technique described above was used with the
POL2B peptide to test for the presence of anti-MSRV-1
specific IgG antibodies in the serum of 29 patients for
whom a definite or probable diagnosis of MS was
established according to the criteria of Poser (23), and of 32
healthy controls (blood donors).

Figure 29 shows the results for each serum
tested with an anti-IgG antibody. Each vertical bar
represents the net optical density (OD at 492 nm) of a
serum tested. The ordinate axis gives the net OD at the
top of the vertical bars. The first 29 vertical bars lying
to the left of the vertical broken line represent the sera
of 29 cases of MS tested, and the 32 vertical bars lying
to the right of the vertical broken line represent the
sera of 32 healthy controls (blood donors).

The mean of the net OD values for the MS sera
tested is 0.62. The diagram enables 5 controls to be
revealed whose net OD rises above the grouped values of
the control population. These values may represent the
presence of specific IgGs in symptomless seropositive
patients. Two methods were hence evaluated in order to
determine the statistical threshold of positivity of the
test.

The mean of the net OD values for the controls,
including the controls with high net OD values, is 0.36.
Without the 5 controls whose net OD values are greater
than or equal to 0.5, the mean of the "negative" controls
is 0.33. The standard deviation of the negative controls
is 0.10. A theoretical threshold of positivity may be
calculated according to the formula:

threshold value = (mean of the net OD values of the
seronegative controls) + (2 or 3 x standard deviation of
the net OD values of the seronegative controls).

In the first case, the 5 controls with high net OD
values are considered to be symptomless seropositives and
are excluded, and the threshold value is
equal to 0.33 + (2 x 0.10) = 0.53. The negative results
then represent a non-specific "background", unrelated to
the presence of antibodies directed specifically against
an epitope of the peptide.

In the second case, if the set of controls
consisting of blood donors in apparent good health is
taken as a reference basis, without excluding the sera
which are, on the face of it, seropositive, the standard
deviation of the "non-MS controls" is 0.116. The threshold
value then becomes 0.36 + (2 x 0.116) = 0.59.
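The two threshold calculations above follow the same formula (mean plus a multiple of the standard deviation). A minimal sketch reproducing the two values from the figures quoted in the text is given below; the function name is hypothetical.

```python
def positivity_threshold(mean_net_od, std_dev, k=2):
    """Threshold = mean of control net OD values + k x their standard deviation."""
    return mean_net_od + k * std_dev


if __name__ == "__main__":
    # First case: the 5 high controls are excluded (mean 0.33, SD 0.10).
    first = positivity_threshold(0.33, 0.10)
    # Second case: all 32 controls are kept (mean 0.36, SD 0.116).
    second = positivity_threshold(0.36, 0.116)
    print(f"first method:  {first:.2f}")
    print(f"second method: {second:.2f}")
```

Setting `k=3` gives the stricter variant mentioned in the formula (mean plus 3 standard deviations).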

According to this analysis, the test is specific
for MS since, as shown in Table 1, no control
has a net OD above this threshold. In fact, this result
reflects the fact that the antibody titres in patients
suffering from MS are, for the most part, higher than in
healthy controls who have been in contact with MSRV-1.

TABLE No. 1



MS 



CONTROLS 






0 
1 
0 
0 , 
0 . 
0 . 
0 . 
0 , 
0 . 
0 . 
0 . 
0 . 
0 . 
0 . 
0 . 



681 
0425 
5675 
63 
588 
645 
6635 
576 
7765 
5745 
513 
4325 
7255 
859 
6435 
0 . 5795 
0 .8655 
671 
596 
662 
602 
525 
53 
565 
517 
607 
3705 
397 



0 .3515 



0 . 4395 



0 
0 

0 , 
0 . 
0 , 
0 . 



. 56 
.3565 
449 
,2825 
55 
52 
0 .2535 
0 .55 
0 .51 
0 .426 
0 .451 
0.227 
0 .3905 
0 .265 
0.4295 
.291 
347 
,4495 
3725 
0 .181 
0 . 2725 
0 .426 
0. 1915 
0 .222 
0.395 
34 
,307 
219 
491 
2265 
2605 



0 

0 . 

0 

0. 



0 
0 
0 
0 
0 
0, 



                   MS      CONTROLS
MEAN               0.62    0.33
STD DEV            0.14    0.10
THRESHOLD VALUE            0.53

In accordance with the first method of
calculation, and as shown in Figure 29 and in the corresponding
Table 1, 26 of the 29 MS sera give a positive result (net
OD greater than or equal to 0.50), indicating the presence
of IgGs specifically directed against the POL2B peptide,
hence against a portion of the reverse transcriptase
enzyme of the MSRV-1 retrovirus encoded by its pol gene,
and consequently against the MSRV-1 retrovirus. Thus,
approximately 90% of the MS patients tested have reacted
against an epitope carried by the POL2B peptide and
possess circulating IgGs directed against the latter.

Five out of 32 blood donors in apparent good health show a
positive result. Thus, it is apparent that approximately 15% of
the symptomless population may have been in contact with an
epitope carried by the POL2B peptide under conditions which have
led to an active immunization which manifests itself in the
persistence of specific serum IgGs. These conditions are
compatible with an immunization against the MSRV-1 retrovirus
reverse transcriptase during an infection with (and/or
reactivation of) the MSRV-1 retrovirus. The absence of apparent
neurological pathology recalling MS in these seropositive
controls may indicate that they are healthy carriers and have
eliminated an infectious virus after immunizing themselves, or
that they constitute an at-risk population of chronic carriers.
Indeed, epidemiological data showing that a pathogenic agent
present in the environment of regions of high prevalence of MS
may be the cause of this disease imply that a fraction of the
population free from MS has necessarily been in contact with such
a pathogenic agent. It has been shown that the MSRV-1 retrovirus
constitutes all or part of this "pathogenic agent" at the source
of MS, and it is hence normal for controls taken from a healthy
population to possess IgG type antibodies against components of
the MSRV-1 retrovirus. Thus, the difference in seroprevalence
between the MS and control populations is extremely significant:
"chi-squared" test, p < 0.001. These results hence point to an
aetiopathogenic role of MSRV-1 in MS.
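
As a rough check of the stated significance, the IgG counts quoted
above (26 positive out of 29 MS sera, 5 positive out of 32
controls) can be put through a 2 x 2 chi-squared test. The sketch
below assumes a Pearson test without continuity correction, which
the text does not specify; the function name is illustrative.

```python
import math

def chi_squared_2x2(pos_a, n_a, pos_b, n_b):
    """Pearson chi-squared statistic (1 degree of freedom, no continuity
    correction) and its p-value for two groups with positive counts."""
    a, b = pos_a, n_a - pos_a      # group A: positive, negative
    c, d = pos_b, n_b - pos_b      # group B: positive, negative
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Counts taken from the text: MS sera versus blood-donor controls (IgG).
chi2, p = chi_squared_2x2(26, 29, 5, 32)
print(f"chi-squared = {chi2:.2f}, p = {p:.2e}")
```

Under these assumptions the statistic lies far above 10.83, the
critical value for p = 0.001 with one degree of freedom, which is
consistent with the significance quoted for the IgG test.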

d) Detection of anti-MSRV-1 IgM antibodies by ELISA:

The ELISA technique with the POL2B peptide was used to test for
the presence of anti-MSRV-1 IgM specific antibodies in the serum
of 36 patients for whom a definite or probable diagnosis of MS
was established according to the criteria of Poser (23), and of
42 healthy controls (blood donors).

Figure 30 shows the results for each serum tested with an
anti-IgM antibody. Each vertical bar represents the net optical
density (OD at 492 nm) of a serum tested. The ordinate axis gives
the net OD at the top of the vertical bars. The first 36 vertical
bars lying to the left of the vertical broken line cutting the
abscissa axis represent the sera of 36 cases of MS tested, and
the vertical bars lying to the right of the vertical broken line
represent the sera of 42 healthy controls (blood donors). The
horizontal line drawn in the middle of the diagram represents a
theoretical threshold defining the boundary between the positive
results (in which the top of the bar lies above) and the negative
results (in which the top of the bar lies below).

The mean of the net OD values for the MS cases tested is 0.19.

The mean of the net OD values for the controls is 0.09.

The standard deviation of the negative controls is 0.05.

In view of the small difference between the mean and the standard
deviation of the controls, the threshold of theoretical
positivity may be calculated according to the formula:

threshold value = (mean of the net OD values of the seronegative
controls) + (3 x standard deviation of the net OD values of the
seronegative controls).

The threshold value is hence equal to 0.09 + (3 x 0.05) = 0.26;
or, in practice, 0.25.
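
The threshold formula above can be sketched as follows. The
function names are illustrative, and the use of the sample
standard deviation is an assumption, since the text does not say
which estimator was used. Note that the rounded summary values
quoted in the text (mean 0.09, standard deviation 0.05) give 0.24
when entered directly; the stated 0.26 presumably reflects the
unrounded statistics.

```python
import statistics

def positivity_threshold(control_ods, k=3.0):
    """Mean of the control net OD values plus k standard deviations."""
    return statistics.mean(control_ods) + k * statistics.stdev(control_ods)

def threshold_from_summary(mean_od, sd_od, k=3.0):
    """Same formula applied to already-computed summary statistics."""
    return mean_od + k * sd_od

def count_positive(net_ods, threshold):
    """Number of sera whose net OD lies above the threshold."""
    return sum(1 for od in net_ods if od > threshold)

# Rounded summary values quoted in the text for the controls.
print(round(threshold_from_summary(0.09, 0.05), 2))

# Hypothetical net OD readings, for illustration only.
example_sera = [0.06, 0.27, 0.31, 0.14, 1.05]
print(count_positive(example_sera, 0.25))
```

Passing the full list of control readings to
`positivity_threshold` avoids the rounding effect noted above.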

The negative results represent a non-specific "background" of the
presence of antibodies directed specifically against an epitope
of the peptide.

According to this analysis, and as shown in Figure 30 and in the
corresponding Table 2, the IgM test is specific for MS, since no
control has a net OD above the threshold. 7 of the 36 MS sera
produce a positive IgM result; a study of the clinical data
reveals that these positive sera were taken during a first attack
of MS or an acute attack in untreated patients. It is known that
IgMs directed against pathogenic agents are produced during
primary infections or during reactivations following a latency
phase of the said pathogenic agent.

The difference in seroprevalence between the MS and control
populations is extremely significant: "chi-squared" test,
p < 0.001.

These results point to an aetiopathogenic role of MSRV-1 in MS.

The detection of IgM and IgG antibodies against the POL2B peptide
enables the course of an MSRV-1 infection and/or of the viral
reactivation of MSRV-1 to be evaluated.

TABLE NO. 2

MS        CONTROLS

0.064     0.243
0.087     0.11
0.044     0.098
0.115     0.028
0.089     0.094
0.025     0.038
0.097     0.176
0.108     0.146
0.018     0.049
0.234     0.161
0.274     0.113
0.225     0.079
0.314     0.093
0.522     0.127
0.306     0.02
0.143     0.052
0.375     0.062
0.142     0.074
0.157     0.043
0.168     0.046
1.051     0.041
0.104     0.13
0.187     0.153
0.044     0.107
0.053     0.178
0.153     0.114
0.07      0.078
0.033     0.118
0.104     0.177
0.187     0.026
0.044     0.024
0.053     0.046
0.153     0.116
0.07      0.04
0.033     0.028
0.973     0.073
          0.008
          0.074
          0.141
          0.219
          0.047
          0.017

                    MS      CONTROLS
MEAN                0.19    0.09
STD. DEV.           0.23    0.05
THRESHOLD VALUE             0.26

e) Search for immunodominant epitopes in the POL2B peptide:

In order to reduce the non-specific background and to optimize
the detection of the responses of the anti-MSRV-1 antibodies, the
synthesis of octapeptides, advancing in successive one amino acid
steps, covering the whole of the sequence determined by POL2B,
was carried out according to the protocol described below.

The chemical synthesis of overlapping octapeptides covering the
amino acid sequence 61-110 shown in the identifier SEQ ID NO: 39
was carried out on an activated cellulose membrane according to
the technique of BERG et al. (1989, J. Am. Chem. Soc., 111,
8024-8026) marketed by Cambridge Research Biochemicals under the
trade name Spotscan. This technique permits the simultaneous
synthesis of a large number of peptides and their analysis.
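
The "successive one amino acid steps" layout can be illustrated
with a sliding window. The sketch below is only an illustration:
the helper name is hypothetical, and the 13-residue input is
assembled from two overlapping octapeptides quoted later in this
example (FCIPVRPD and RPDSQFLF), not taken from SEQ ID NO: 39
itself.

```python
def overlapping_peptides(sequence, length=8, step=1):
    """Return every window of `length` residues, advancing `step` residues."""
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]

fragment = "FCIPVRPDSQFLF"  # illustrative fragment (see lead-in)
for start, peptide in enumerate(overlapping_peptides(fragment), start=1):
    print(start, peptide)

# A stretch of 50 residues (positions 61 to 110) yields 50 - 8 + 1 windows.
print(len(overlapping_peptides("A" * 50)))
```

For residues 61 to 110 this layout corresponds to 43 octapeptide
spots per membrane.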

The synthesis is carried out with esterified amino acids in which
the α-amino group is protected with an FMOC group (Nova Biochem)
and the side-chain groups with protective groups such as trityl,
t-butyl ester or t-butyl ether. The esterified amino acids are
solubilized in N-methylpyrrolidone (NMP) at a concentration of
300 nM, and 0.9 ml are applied to spots of deposit of bromophenol
blue. After incubation for 15 minutes, a further application of
amino acids is carried out according to another 15-minute
incubation. If the coupling between two amino acids has taken
place correctly, a coloration modification (change from blue to
yellow-green) is observed. After three washes in DMF, an
acetylation step is performed with acetic anhydride. Next, the
terminal amino groups of the peptides in the process of synthesis
are deprotected with 20% pyridine in DMF. The spots of deposit
are restained with a 1% solution of bromophenol blue in DMF,
washed three times with methanol and dried. This set of
operations constitutes one cycle of addition of an amino acid,
and this cycle is repeated until the synthesis is complete. When
all the amino acids have been added, the NH2-terminal group of
the last amino acid is deprotected with 20% piperidine in DMF and
acetylated with acetic anhydride. The groups protecting the side
chain are removed with a dichloromethane/trifluoroacetic
acid/triisobutylsilane (5 ml/5 ml/250 ml) mixture. The
immunoreactivity of the peptides is then tested by ELISA.



After synthesis of the different octapeptides in duplicate on two
different membranes, the latter are rinsed with methanol and
washed in TBS (0.1M Tris pH 7.2), then incubated overnight at
room temperature in a saturation buffer. After several washes in
TBS-T (0.1M Tris pH 7.2 - 0.05% Tween 20), one membrane is
incubated with a 1/50 dilution of a reference serum originating
from a patient suffering from MS, and the other membrane with a
1/50 dilution of a pool of sera of healthy controls. The
membranes are incubated for 4 hours at room temperature. After
washes with TBS-T, a β-galactosidase-labelled anti-human
immunoglobulin conjugate (marketed by Cambridge Research
Biochemicals) is added at a dilution of 1/200, and the mixture is
incubated for two hours at room temperature. After washes of the
membranes with 0.05% TBS-T and PBS, the immunoreactivity in the
different spots is visualized by adding 5-bromo-4-chloro-3-indolyl
β-D-galactopyranoside in potassium. The intensity of coloration
of the spots is estimated qualitatively with a relative value
from 0 to 5 as shown in the attached Figures 31 to 33.

In this way, it is possible to determine two immunodominant
regions at each end of the POL2B peptide, corresponding,
respectively, to the amino acid sequences 65-75 (SEQ ID NO: 41)
and 92-109 (SEQ ID NO: 42), according to Figure 34, and lying,
respectively, between the octapeptides
Phe-Cys-Ile-Pro-Val-Arg-Pro-Asp (FCIPVRPD) and
Arg-Pro-Asp-Ser-Gln-Phe-Leu-Phe (RPDSQFLF), and
Thr-Val-Leu-Pro-Gln-Gly-Phe-Arg (TVLPQGFR) and
Leu-Phe-Gly-Gln-Ala-Leu-Ala-Gln (LFGQALAQ), and a region which is
less reactive but apparently more specific, since it does not
produce any background with the control serum, represented by the
octapeptides Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu (LFAFEDPL)
(SEQ ID NO: 43) and Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn (FAFEDPLN)
(SEQ ID NO: 44).

These regions make it possible to define new peptides which are
more specific and more immunoreactive according to the usual
techniques.

It is thus possible, as a result of the discoveries made and the
methods developed by the inventors, to carry out a diagnosis of
MSRV-1 infection and/or reactivation and to evaluate a therapy in
MS on the basis of its efficacy in "negativing" the detection of
these agents in the patients' biological fluids. Furthermore,
early detection in individuals not yet displaying neurological
signs of MS could make it possible to institute a treatment which
would be all the more effective with respect to the subsequent
clinical course for the fact that it would precede the lesion
stage which corresponds to the onset of neurological disorders.
At the present time, however, a diagnosis of MS cannot be
established before a symptomatology of neurological lesions has
set in, and hence no treatment is instituted before the emergence
of a clinical picture suggestive of lesions of the central
nervous system which are already significant. The diagnosis of an
MSRV-1 and/or MSRV-2 infection and/or reactivation in man is
hence of decisive importance, and the present invention provides
the means of doing this.

It is thus possible, apart from carrying out a diagnosis of
MSRV-1 infection and/or reactivation, to evaluate a therapy in MS
on the basis of its efficacy in "negativing" the detection of
these agents in the patients' biological fluids.

EXAMPLE 12: OBTAINING A CLONE LB19 CONTAINING A PORTION OF THE
gag GENE OF THE MSRV-1 RETROVIRUS

A PCR technique derived from the technique published by
Gonzalez-Quintial R et al. (19) and PLAZA et al. (25) was used.
From the total RNAs extracted from a fraction of virion purified
as described above, the cDNA was synthesized using a specific
primer (SEQ ID No. 64) at the 3' end of the genome to be
amplified, using EXPAND™ REVERSE TRANSCRIPTASE (BOEHRINGER
MANNHEIM).

cDNA:

AAGGGGCATG GACGAGGTGG TGGCTTATTT (SEQ ID NO: 65) (antisense)

After purification, a poly(G) tail was added at the 5' end of the
cDNA using the "Terminal transferases kit" marketed by the
company Boehringer Mannheim, according to the manufacturer's
protocol.

An anchoring PCR was carried out using the following 5' and 3'
primers:

AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC (SEQ ID No. 91) (sense), and
AAATGTCTGC GGCACCAATC TCCATGTT (SEQ ID No. 64) (antisense)

Next, a semi-nested anchoring PCR was carried out with the
following 5' and 3' primers:

AGATCTGCAG AATTCGATAT CA (SEQ ID No. 92) (sense), and
AAATGTCTGC GGCACCAATC TCCATGTT (SEQ ID No. 64) (antisense)
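
The relationship between the two sense primers can be checked
mechanically. The sketch below, whose helper names are
illustrative, confirms that the sense primer of the first PCR
(SEQ ID No. 91) is the semi-nested sense primer (SEQ ID No. 92)
extended by a poly(C) stretch, which is the complement of the
poly(G) tail added to the cDNA.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def clean(seq):
    """Remove the spaces used to print sequences in blocks of ten."""
    return seq.replace(" ", "").upper()

anchor_tailed = clean("AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC")  # SEQ ID No. 91
anchor_core = clean("AGATCTGCAG AATTCGATAT CA")                   # SEQ ID No. 92

# Whatever follows the shared core in the longer primer is the tail.
tail = anchor_tailed[len(anchor_core):]
print(anchor_tailed.startswith(anchor_core))   # is the core shared by both primers?
print(len(tail), set(tail))                    # length and composition of the tail
print(reverse_complement(tail))                # stretch the tail can anneal to
```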

The products originating from the PCR were purified on agarose
gel according to conventional methods (17), and then resuspended
in 10 microlitres of distilled water. Since one of the properties
of Taq polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit (British
Biotechnology). The 2 µl of DNA solution were mixed with 5 µl of
sterile distilled water, 1 µl of 10-fold concentrated ligation
buffer "10x LIGATION BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/ml)
and 1 µl of "T4 DNA LIGASE". This mixture was incubated overnight
at 12°C. The following steps were carried out according to the
instructions of the TA Cloning™ kit (British Biotechnology). At
the end of the procedure, the white colonies of recombinant
bacteria were picked out in order to be cultured and to permit
extraction of the plasmids incorporated according to the
so-called "miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an insert
detected under UV light after staining the gel with ethidium
bromide were selected for sequencing of the insert, after
hybridization with a primer complementary to the Sp6 promoter
present on the cloning plasmid of the TA Cloning Kit™. The
reaction prior to sequencing was then performed according to the
method recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit" (Applied
Biosystems, ref. 401384), and automatic sequencing was carried
out with an Applied Biosystems "Automatic Sequencer, model 373 A"
apparatus according to the manufacturer's instructions.

PCR amplification according to the technique mentioned above was
used on a cDNA synthesized from the nucleic acids of fractions of
infective particles purified on a sucrose gradient, according to
the technique described by H. Perron (13), from culture
supernatants of B lymphocytes of a patient suffering from MS,
immortalized with Epstein-Barr virus (EBV) strain B95 and
expressing retroviral particles associated with reverse
transcriptase activity as described by Perron et al. (3) and in
French Patent Applications MS 10, 11 and 12. It yielded the clone
LB19, whose sequence, identified by SEQ ID NO: 59, is presented
in Figure 35.

The clone makes it possible to define, with the clone GM3
previously sequenced and the clone G+E+A (see Example 15), a
region of 690 base pairs representative of a significant portion
of the gag gene of the MSRV-1 retrovirus, as presented in
Figure 36. This sequence, designated SEQ ID NO: 88, is
reconstituted from different clones overlapping at their ends. It
is identified under the name MSRV-1 "gag*" region. In Figure 36,
a potential reading frame with the translation into amino acids
is presented below the nucleic acid sequence.

EXAMPLE 13: OBTAINING A CLONE FBd13 CONTAINING A pol GENE REGION
RELATED TO THE MSRV-1 RETROVIRUS AND AN APPARENTLY INCOMPLETE ENV
REGION CONTAINING A POTENTIAL READING FRAME (ORF) FOR A
GLYCOPROTEIN

Extraction of viral RNAs: The RNAs were extracted according to
the method briefly described below.

A pool of culture supernatant of B lymphocytes of patients
suffering from MS (650 ml) is centrifuged for 30 minutes at
10,000 g. The viral pellet obtained is resuspended in
300 microlitres of PBS/10 mM MgCl2. The material is treated with
a DNAse (100 mg/ml)/RNAse (50 mg/ml) mixture for 30 minutes at
37°C and then with proteinase K (50 mg/ml) for 30 minutes at
46°C.

The nucleic acids are extracted with one volume of a
phenol/0.1% SDS (V/V) mixture heated to 60°C, and then
re-extracted with one volume of phenol/chloroform (1:1; V/V).

Precipitation of the material is performed with 2.5 V of ethanol
in the presence of 0.1 V of sodium acetate pH 5.2. The pellet
obtained after centrifugation is resuspended in 50 microlitres of
sterile DEPC water.

The sample is treated again with 50 mg/ml of "RNAse free" DNAse
for 30 minutes at room temperature, extracted with one volume of
phenol/chloroform and precipitated in the presence of sodium
acetate and ethanol.

The RNA obtained is quantified by an OD reading at 260 nm. The
presence of MSRV-1 and the absence of DNA contaminant is
monitored by a PCR and an MSRV-1-specific RT-PCR associated with
a specific ELOSA for the MSRV-1 genome.

Synthesis of cDNA:

5 mg of RNA are used to synthesize a cDNA primed with a poly(dT)
oligonucleotide according to the instructions of the "cDNA
Synthesis Module" kit (ref. RPN 1256, Amersham) with a few
modifications: the reverse transcription is performed at 45°C
instead of the recommended 42°C.

The synthesis product is purified by a double extraction and a
double purification according to the manufacturer's instructions.

The presence of MSRV-1 is verified by an MSRV-1 PCR associated
with a specific ELOSA for the MSRV-1 genome.

"Long Distance PCR": (LD-PCR) 

500 ng of cDNA are used for the LD-PCR step (Expand Long Template
System; Boehringer (ref. 1681 842)).

Several pairs of oligonucleotides were used. Among these, the
pair defined by the following primers:
5' primer: GGAGAAGAGC AGCATAAGTG G (SEQ ID NO: 66)
3' primer: GTGCTGATTG GTGTATTTAC AATCC (SEQ ID NO: 67).

The amplification conditions are as follows:
94°C 10 seconds
56°C 30 seconds
68°C 5 minutes;
10 cycles, then 20 cycles with an increment of 20 seconds in each
cycle on the elongation time. At the end of this first
amplification, 2 microlitres of the amplification product are
subjected to a second amplification under the same conditions as
before.

The LD-PCR reactions are conducted in a Perkin model 9600 PCR
apparatus in thin-walled microtubes (Boehringer).

The amplification products are monitored by electrophoresis of
1/5th of the amplification volume (10 microlitres) in 1% agarose
gel. For the pair of primers described above, a band of
approximately 1.7 Kb is obtained.
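
The LD-PCR cycling programme described above can be written out
explicitly. The sketch below assumes that the 20-second increment
is cumulative (each of the last 20 cycles elongates 20 seconds
longer than the previous one), which is the usual reading of such
programmes but is not spelled out in the text; ramp times are
ignored and the names are illustrative.

```python
def elongation_schedule(base_s=300, fixed_cycles=10, extended_cycles=20,
                        increment_s=20):
    """Elongation time in seconds for each cycle of the programme."""
    fixed = [base_s] * fixed_cycles
    extended = [base_s + increment_s * i for i in range(1, extended_cycles + 1)]
    return fixed + extended

DENATURATION_S = 10   # 94 °C step
ANNEALING_S = 30      # 56 °C step

schedule = elongation_schedule()
# Hold time per cycle = denaturation + annealing + elongation.
total_s = sum(DENATURATION_S + ANNEALING_S + e for e in schedule)
print(len(schedule), schedule[-1], total_s / 3600)
```

Under these assumptions the final elongation step lasts
700 seconds, and one round of 30 cycles holds the block at
temperature for 4 hours in total.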

Cloning of the amplified fragment:

The PCR product was purified by passage through a preparative
agarose gel and then through a Costar column (Spin; D. Dutcher)
according to the supplier's instructions.

2 microlitres of the purified solution are joined up with 50 ng
of vector PCRII according to the supplier's instructions (TA
Cloning Kit; British Biotechnology).

The recombinant vector obtained is isolated by transformation of
competent DH5αF' bacteria. The bacteria are selected using their
resistance to ampicillin and the loss of metabolism for Xgal
(= white colonies). The molecular structure of the recombinant
vector is confirmed by plasmid minipreparation and hydrolysis
with the enzyme EcoRI.

FBd13, a positive clone for all these criteria, was selected. A
large-scale preparation of the recombinant plasmid was performed
using the Midiprep Qiagen kit (ref. 12243) according to the
supplier's instructions.

Sequencing of the clone FBd13 is performed by means of the Perkin
Prism Ready Amplitaq FS dye terminator kit (ref. 402119)
according to the manufacturer's instructions. The sequence
reactions are introduced into a Perkin type 377 or 373A automatic
sequencer. The sequencing strategy consists in gene walking
carried out on both strands of the clone FBd13.

The sequence of the clone FBd13 is identified by SEQ ID NO: 58.

In Figure 37, the sequence homology between the clone FBd13 and
the HSERV-9 retrovirus is shown on the matrix chart by a
continuous line for any partial homology greater than or equal to
70%. It can be seen that there are homologies in the flanking
regions of the clone (with the pol gene at the 5' end and with
the env gene and then the LTR at the 3' end), but that the
internal region is totally divergent and does not display any
homology, even weak, with the env gene of HSERV-9. Furthermore,
it is apparent that the clone FBd13 contains a longer "env"
region than the one which is described for the defective
endogenous HSERV-9; it may thus be seen that the internal
divergent region constitutes an "insert" between the regions of
partial homology with the HSERV-9 defective genes.

This additional sequence determines a potential ORF, designated
ORF B13, which is represented by its amino acid sequence
SEQ ID NO: 87.

The molecular structure of the clone FBd13 was analyzed using the
GeneWork software and the GenBank and SwissProt data banks.

5 glycosylation sites were found.

The protein does not have significant homology with already known
sequences.

It is probable that this clone originates from a recombination of
an endogenous retroviral element (ERV), linked to the replication
of MSRV-1.
Such a phenomenon cannot fail to generate the expression of
polypeptides, or even of endogenous retroviral proteins, which
are not necessarily tolerated by the immune system. Such a scheme
of aberrant expression of endogenous elements related to MSRV-1
and/or induced by the latter is liable to multiply the aberrant
antigens, and hence tends to contribute to the induction of
autoimmune processes such as are observed in MS. It clearly
constitutes a novel element never hitherto described. Indeed,
interrogation of the data banks of nucleic acid sequences
available in version No. 19 (1996) of the "Entrez" software
(NCBI, NIH, Bethesda, USA) did not enable a known homologous
sequence comprising the whole of the env region of this clone to
be identified.



EXAMPLE 14: OBTAINING A CLONE FP6 CONTAINING A PORTION OF THE pol
GENE, WITH A REGION CODING FOR THE REVERSE TRANSCRIPTASE ENZYME
HOMOLOGOUS TO THE CLONE POL* MSRV-1, AND A 3'pol REGION DIVERGENT
FROM THE EQUIVALENT SEQUENCES DESCRIBED IN THE CLONES POL*, tpol,
FBd3, JLBc1 and JLBc2

A 3'RACE was performed on total RNA extracted from plasma of a
patient suffering from MS. A healthy control plasma treated under
the same conditions was used as negative control. The synthesis
of cDNA was carried out with the following modified oligo(dT)
primer:

5' GACTCGCTGC AGATCGATTT TTTTTTTTTT TTTT 3' (SEQ ID NO: 68)

and Boehringer "Expand RT" reverse transcriptase according to the
conditions recommended by the company. A PCR was performed with
the enzyme Klentaq (Clontech) under the following conditions:
94°C 5 min then 93°C 1 min, 58°C 1 min, 68°C 3 min for 40 cycles
and 68°C for 8 min, and with a final reaction volume of 50 µl.

Primers used for the PCR:

- 5' primer, identified by SEQ ID NO: 69
5' GCCATCAAGC CACCCAAGAA CTCTTAACTT 3';
- 3' primer, identified by SEQ ID NO: 68 (= the same as for the
cDNA)

A second, so-called "semi-nested" PCR was carried out with a 5'
primer located within the region already amplified. This second
PCR was performed under the same experimental conditions as those
used in the first PCR, using 10 µl of the amplification product
originating from the first PCR.

Primers used for the semi-nested PCR:

- 5' primer, identified by SEQ ID NO: 70
5' CCAATAGCCA GACCATTATA TACACTAATT 3';
- 3' primer, identified by SEQ ID NO: 68 (= the same as for the
cDNA)

Primers SEQ ID NO: 69 and SEQ ID NO: 70 are specific for the pol*
region: position No. 403 to No. 422 and No. 641 to No. 670,
respectively.

An amplification product was thus obtained from the extracellular
RNA extracted from the plasma of a patient suffering from MS. The
corresponding fragment was not observed for the plasma of the
healthy control. This amplification product was cloned in the
following manner.

The amplified DNA was inserted into a plasmid using the TA
Cloning™ kit. The 2 µl of DNA solution were mixed with 5 µl of
sterile distilled water, 1 µl of a 10-fold concentrated ligation
buffer "10x LIGATION BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/ml)
and 1 µl of "T4 DNA LIGASE". This mixture was incubated overnight
at 12°C. The following steps were carried out according to the
instructions of the TA Cloning™ kit (British Biotechnology). At
the end of the procedure, the white colonies of recombinant
bacteria were picked out in order to be cultured and to permit
extraction of the plasmids incorporated according to the
so-called "miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable restriction
enzyme and analyzed on agarose gel. Plasmids possessing an insert
detected under UV light after staining the gel with ethidium
bromide were selected for sequencing of the insert, after
hybridization with a primer complementary to the Sp6 promoter
present on the cloning plasmid of the TA Cloning Kit™. The
reaction prior to sequencing was then performed according to the
method recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit" (Applied
Biosystems, ref. 401384), and automatic sequencing was carried
out with an Applied Biosystems "Automatic Sequencer, model 373 A"
apparatus according to the manufacturer's instructions.

The clone obtained, designated FP6, enables a region of 467 bp
which is 89% homologous to the pol* region of the MSRV-1
retrovirus and a region of 1167 bp which is 64% homologous to the
pol region of ERV-9 (No. 1634 to 2856) to be defined.

The clone FP6 is represented in Figure 38 by its nucleotide
sequence identified by SEQ ID NO: 61. The three potential reading
frames of this clone are indicated by their amino acid sequence
under the nucleotide sequence.
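
Reading a nucleotide sequence in its three forward frames can be
sketched as follows. The code uses the standard genetic code; the
function names are illustrative, and because SEQ ID NO: 61 is not
reproduced here, the 30-nucleotide primer SEQ ID NO: 69 quoted
above serves as a stand-in input.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(seq, frame=0):
    """Translate `seq` from offset `frame`, ignoring an incomplete last codon."""
    seq = seq.replace(" ", "").upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

def three_frames(seq):
    """Translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]

primer_69 = "GCCATCAAGC CACCCAAGAA CTCTTAACTT"  # SEQ ID NO: 69, stand-in input
for frame, protein in enumerate(three_frames(primer_69), start=1):
    print(frame, protein)
```

A stop codon appears as `*` in the output, so an open reading
frame shows up as a long stretch without that character.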

EXAMPLE 15: OBTAINING A REGION DESIGNATED G+E+A CONTAINING AN ORF
FOR A RETROVIRAL PROTEASE, BY PCR AMPLIFICATION OF THE NUCLEIC
ACID SEQUENCE CONTAINED BETWEEN THE 5' REGION DEFINED BY THE
CLONE "GM3" AND THE 3' REGION DEFINED BY THE CLONE POL*, FROM THE
RNA EXTRACTED FROM A POOL OF PLASMAS OF PATIENTS SUFFERING FROM
MS

Oligonucleotides specific for the MSRV-1 sequences already
identified by the Applicant were defined in order to amplify the
retroviral RNA originating from virions present in the plasma of
patients suffering from MS. Control reactions were performed so
as to monitor the presence of contaminants (reaction with water).
The amplification consists of a step of RT-PCR followed by a
"nested" PCR. Pairs of primers were defined for amplifying three
overlapping regions (designated G, E and A) on the regions
defined by the sequences of the clones GM3 and pol* described
above.

35 

Semi-nested RT-PCR for amplification of the region G:

- in the first RT-PCR cycle, the following primers are used:
primer 1: SEQ ID NO: 71 (sense)
primer 2: SEQ ID NO: 72 (antisense)
- in the second PCR cycle, the following primers are used:
primer 1: SEQ ID NO: 73 (sense)
primer 4: SEQ ID NO: 74 (antisense)

Nested RT-PCR for amplification of the region E:

- in the first RT-PCR cycle, the following primers are used:
primer 5: SEQ ID NO: 75 (sense)
primer 6: SEQ ID NO: 76 (antisense)
- in the second PCR cycle, the following primers are used:
primer 7: SEQ ID NO: 77 (sense)
primer 8: SEQ ID NO: 78 (antisense)

Semi-nested RT-PCR for amplification of the region A:

- in the first RT-PCR cycle, the following primers are used:
primer 9: SEQ ID NO: 79 (sense)
primer 10: SEQ ID NO: 80 (antisense)
- in the second PCR cycle, the following primers are used:
primer 9: SEQ ID NO: 81 (sense)
primer 11: SEQ ID NO: 82 (antisense)
The primers and the regions G, E and A which they define are
positioned as follows:

cDNA

1        G        4  2
         5  7     E     8  6
                  9     A     11  10
<---------------><------------------>
       GM3               POL*

The sequence of the region defined by the different clones G, E
and A was determined after cloning and sequencing of the "nested"
amplification products.

The clones G, E and A were assembled together by PCR with the
primers 1 at the 5' end of the fragment G and 11 at the 3' end of
the fragment A, the primers being described above. An
approximately 1580-bp fragment G+E+A was amplified and inserted
into a plasmid using the TA Cloning (trademark) kit. The sequence
of the amplification product corresponding to G+E+A was
determined and analysis of the G+E and E+A overlaps was carried
out. The sequence is shown in Figure 39, and corresponds to the
sequence SEQ ID NO: 89.

A reading frame coding for an MSRV-1 retroviral protease was
found in the region E. The amino acid sequence of the protease,
identified by SEQ ID NO: 90, is presented in Figure 40.

EXAMPLE 16: OBTAINING A CLONE LTRGAG12, RELATED TO AN ENDOGENOUS
RETROVIRAL ELEMENT (ERV) CLOSE TO MSRV-1, IN THE DNA OF AN MS
LYMPHOBLASTOID LINE PRODUCING VIRIONS AND EXPRESSING THE MSRV-1
RETROVIRUS

A nested PCR was performed on the DNA extracted from a
lymphoblastoid line (B lymphocytes immortalized with the EBV
virus strain B95, as described above and as is well known to a
person skilled in the art) expressing the MSRV-1 retrovirus and
originating from peripheral blood lymphocytes of a patient
suffering from MS.

In the first PCR step, the following primers are used:

primer 4327: CTCGATTTCT TGCTGGGCCT TA (SEQ ID NO: 83)
primer 3512: GTTGATTCCC TCCTCAAGCA (SEQ ID NO: 84)

This step comprises 35 amplification cycles with the following
conditions: 1 min at 94°C, 1 min at 54°C and 4 min at 72°C.

In the second PCR step, the following primers are used:

primer 4294: CTCTACCAAT CAGCATGTGG (SEQ ID NO: 85)
primer 3591: TGTTCCTCTT GGTCCCTAT (SEQ ID NO: 86)

This step comprises 35 amplification cycles with the following
conditions: 1 min at 94°C, 1 min at 54°C and 4 min at 72°C.

The products originating from the PCR were
purified on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit (British
Biotechnology). 2 µl of DNA solution were mixed with
5 µl of sterile distilled water, 1 µl of a 10-fold
concentrated ligation buffer "10x LIGATION BUFFER", 2 µl
of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA LIGASE".
This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable
restriction enzyme and analyzed on agarose gel. The
plasmids possessing an insert detected under UV light
after staining the gel with ethidium bromide were selected
for sequencing of the insert, after hybridization with a
primer complementary to the Sp6 promoter present on the
cloning plasmid of the TA Cloning Kit™. The reaction prior
to sequencing was then performed according to the method
recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic
sequencing was carried out with an Applied Biosystems
"Automatic Sequencer, model 373 A" apparatus according to
the manufacturer's instructions.
Thus, a clone designated LTRGAG12 could be
obtained, and is represented by its internal sequence
identified by SEQ ID NO: 60.

This clone is probably representative of
endogenous elements close to ERV-9, present in human DNA,
in particular in the DNA of patients suffering from MS,
and capable of interfering with the expression of the
MSRV-1 retrovirus, hence capable of having a role in the
pathogenesis associated with the MSRV-1 retrovirus and
capable of serving as marker for a specific expression in
the pathology in question.

EXAMPLE 17: DETECTION OF ANTI-MSRV-1 SPECIFIC 
ANTIBODIES IN HUMAN SERUM 

Identification of the sequence of the pol gene
of the MSRV-1 retrovirus and of an open reading frame of
this gene enabled the amino acid sequence SEQ ID NO: 63 of
a region of the said gene, referenced SEQ ID NO: 62, to be
determined.

Different synthetic peptides corresponding to
fragments of the protein sequence of MSRV-1 reverse
transcriptase encoded by the pol gene were tested for
their antigenic specificity with respect to sera of
patients suffering from MS and of healthy controls.

The peptides were synthesized chemically by
solid-phase synthesis according to the Merrifield
technique (22). The practical details are those described
below.

a) Peptide synthesis: 

The peptides were synthesized on a
phenylacetamidomethyl (PAM)/polystyrene/divinylbenzene resin
(Applied Biosystems, Inc., Foster City, CA), using an
"Applied Biosystems 430A" automatic synthesizer. The amino
acids are coupled in the form of hydroxybenzotriazole
(HOBT) esters. The amino acids used are obtained from
Novabiochem (Läufelfingen, Switzerland) or Bachem
(Bubendorf, Switzerland).

The chemical synthesis was performed using a
double coupling protocol with N-methylpyrrolidone (NMP) as
solvent. The peptides were cleaved from the resin, and the
side-chain protective groups removed simultaneously, using
hydrofluoric acid (HF) in a suitable apparatus (type I
cleavage apparatus, Peptide Institute, Osaka, Japan).

For 1 g of peptidyl resin, 10 ml of HF, 1 ml of
anisole and 1 ml of dimethyl sulphide (DMS) are used. The
mixture is stirred for 45 minutes at -2°C. The HF is then
evaporated off under vacuum. After intensive washes with
ether, the peptide is eluted from the resin with 10%
acetic acid and then lyophilized.

The peptides are purified by preparative high
performance liquid chromatography on a VYDAC C18 type
column (250 x 21 mm) (The Separation Group, Hesperia, CA,
USA). Elution is carried out with an acetonitrile gradient
at a flow rate of 22 ml/min. The fractions collected are
monitored by an elution under isocratic conditions on a
VYDAC™ C18 analytical column (250 x 4.6 mm) at a flow rate
of 1 ml/min. Fractions having the same retention time are
pooled and lyophilized. The preponderant fraction is then
analysed by analytical high performance liquid
chromatography with the system described above. The
peptide which is considered to be of acceptable purity
manifests itself in a single peak representing not less
than 95% of the chromatogram.

The purified peptides are then analysed with the
object of monitoring their amino acid composition, using
an Applied Biosystems 420H automatic amino acid analyser.
Measurement of the (average) chemical molecular mass of
the peptides is obtained using LSIMS mass spectrometry in
the positive ion mode on a VG ZAB.ZSEQ double focusing
instrument connected to a DEC-VAX 2000 acquisition system
(VG Analytical Ltd, Manchester, England).

The reactivity of the different peptides was
tested against sera of patients suffering from MS and
against sera of healthy controls. This enabled a peptide
designated S24Q to be selected, whose sequence is
identified by SEQ ID NO: 63, encoded by a nucleotide
sequence of the pol gene of MSRV-1 (SEQ ID NO:62).

b) Antigenic properties: 

The antigenic properties of the S24Q peptide 
were demonstrated according to the ELISA protocol 
described below. 

The lyophilized S24Q peptide was dissolved in
10% acetic acid at a concentration of 1 mg/ml. This stock
solution was aliquoted and kept at +4°C for use over a
fortnight, or frozen at -20°C for use within 2 months. An
aliquot is diluted in PBS (phosphate buffered saline)
solution so as to obtain a final peptide concentration of
5 micrograms/ml. 100 microlitres of this dilution are
placed in each well of Nunc Maxisorb (trade name)
microtitration plates. The plates are covered with a
"plate-sealer" type adhesive and kept for 2 hours at +37°C
for the phase of adsorption of the peptide to the plastic.
The adhesive is removed and the plates are washed three
times with a volume of 300 microlitres of a solution A
(1X PBS, 0.05% Tween 20®), then inverted over an
absorbent tissue. The plates thus drained are filled with
250 microlitres per well of a solution B (solution A + 10%
of goat serum), then covered with an adhesive and
incubated for 1 hour at 37°C. The plates are then washed
three times with the solution A as described above.
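The dilution from the 1 mg/ml stock to the 5 micrograms/ml coating solution follows from C1 x V1 = C2 x V2. The sketch below works this out for one plate; the 96-well plate format is an assumption (the text gives only the volume per well), and the function name is illustrative.

```python
# Minimal sketch of the coating-solution dilution, assuming a 96-well plate.

def stock_volume_ul(stock_ug_per_ml: float,
                    final_ug_per_ml: float,
                    final_volume_ul: float) -> float:
    """Volume of stock needed, from C1 x V1 = C2 x V2."""
    return final_ug_per_ml * final_volume_ul / stock_ug_per_ml


STOCK_UG_PER_ML = 1000.0    # 1 mg/ml stock in 10% acetic acid (from the text)
COATING_UG_PER_ML = 5.0     # final peptide concentration in PBS (from the text)
VOLUME_PER_WELL_UL = 100.0  # from the text
WELLS = 96                  # assumption: one full 96-well plate

total_ul = WELLS * VOLUME_PER_WELL_UL
stock_ul = stock_volume_ul(STOCK_UG_PER_ML, COATING_UG_PER_ML, total_ul)
dilution_factor = STOCK_UG_PER_ML / COATING_UG_PER_ML

print(f"1/{dilution_factor:.0f} dilution: {stock_ul:.0f} µl stock "
      f"+ {total_ul - stock_ul:.0f} µl PBS")
```

In practice a small excess would normally be prepared to allow for pipetting losses.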

The test serum samples are diluted beforehand to
1/100 in the solution B, and 100 microlitres of each
dilute test serum are placed in the wells of each
microtitration plate. A negative control is placed in one well
of each plate, in the form of 100 microlitres of buffer B.
The plates covered with an adhesive are then incubated for
1 hour 30 min at 37°C. The plates are then washed three
times with the solution A as described above. For the IgG
response, a peroxidase-labelled goat antibody directed
against human IgG (marketed by Jackson Immuno Research
Inc.) is diluted in the solution B (dilution 1/10,000).
100 microlitres of the appropriate dilution of the
labelled antibody are then placed in each well of the
microtitration plates, and the plates covered with an
adhesive are incubated for 1 hour at 37°C. A further
washing of the plates is then performed as described
above. In parallel, the peroxidase substrate is prepared
according to the directions of the bioMerieux kits. 100
microlitres of substrate solution are placed in each well,
and the plates are placed protected from light for 20 to
30 minutes at room temperature.

When the colour reaction has stabilized,
50 microlitres of Color 2 (bioMerieux trade name) are
placed in each well in order to stop the reaction. The
plates are placed immediately in an ELISA plate
spectrophotometric reader, and the optical density (OD) of
each well is read at a wavelength of 492 nm.

The serological samples are introduced in duplicate
or in triplicate, and the optical density (OD)
corresponding to the serum tested is calculated by taking
the mean of the OD values obtained for the same sample at
the same dilution.

The net OD of each serum corresponds to the mean
OD of the serum minus the mean OD of the negative control
(solution B: PBS, 0.05% Tween 20®, 10% goat serum).
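The net OD calculation described above can be sketched as follows. The readings are hypothetical values chosen for illustration only, and the function name is not taken from the text.

```python
# Minimal sketch of the net OD calculation: mean of the serum replicates
# minus the mean of the negative-control (solution B) wells.
from statistics import mean


def net_od(serum_replicates, negative_control_replicates):
    """Mean OD of the serum minus mean OD of the negative control."""
    return mean(serum_replicates) - mean(negative_control_replicates)


# Hypothetical OD readings at 492 nm, for illustration only.
serum_wells = [0.31, 0.29, 0.30]   # one serum tested in triplicate
negative_wells = [0.05]            # one negative-control well per plate

print(round(net_od(serum_wells, negative_wells), 3))
```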

c) Detection of anti-MSRV-1 IgG antibodies 
(S24Q) by ELISA: 

The technique described above was used with the
S24Q peptide to test for the presence of anti-MSRV-1
specific IgG antibodies in the serum of 15 patients for
whom a definite diagnosis of MS was established according
to the criteria of Poser (23), and of 15 healthy controls
(blood donors).

Figure 41 shows the results for each serum
tested with an anti-IgG antibody. Each vertical bar
represents the net optical density (OD at 492 nm) of a
serum tested. The ordinate axis gives the net OD at the
top of the vertical bars. The first 15 vertical bars lying
to the left of the vertical broken line represent the sera
of 15 healthy controls (blood donors), and the 15 vertical
bars lying to the right of the vertical broken line
represent the sera of 15 cases of MS tested. The diagram
enables 2 controls to be revealed whose OD rises above the
grouped values of the control population. These values may
represent the presence of specific IgGs in symptomless,
seropositive patients. Two methods were hence evaluated in
order to determine the statistical threshold of positivity
of the test.

The mean of the net OD values for the controls,
including the controls with high net OD values, is 0.129
and the standard deviation is 0.06. Without the 2 controls
whose OD values are greater than 0.2, the mean of the
"negative" controls is 0.107 and the standard deviation is
0.03. A theoretical threshold of positivity may be
calculated according to the formula:

threshold value = (mean of the net OD values of the
negative controls) + (2 or 3 x standard deviation
of the net OD values of the negative controls).

In the first case, there are considered to be
symptomless seropositives, and the threshold value is
equal to 0.11 + (3 x 0.03) = 0.20. The negative results
represent a non-specific "background" of the presence of



WO 98/23755 PCT/IB97/01482 



antibodies directed specifically against an epitope of the
peptide.

In the second case, if the set of controls
consisting of blood donors in apparent good health is
taken as a reference basis, without excluding the sera
which are, on the face of it, seropositive, the standard
deviation of the "non-MS controls" is 0.116. The threshold
value then becomes 0.13 + (3 x 0.06) = 0.31.
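The two threshold calculations can be reproduced from the threshold formula (mean plus 3 standard deviations of the negative controls). The sketch below uses the unrounded means (0.107 and 0.129) and the standard deviations (0.03 and 0.06) quoted for the control group; the text itself rounds the means to 0.11 and 0.13 before adding. The function name is illustrative.

```python
# Minimal sketch of the theoretical positivity threshold:
# threshold = mean of negative controls + k x standard deviation.

def positivity_threshold(mean_od: float, sd_od: float, k: int = 3) -> float:
    """Mean net OD of the negative controls plus k standard deviations."""
    return mean_od + k * sd_od


# First case: the 2 high controls are excluded as symptomless seropositives.
first_case = positivity_threshold(0.107, 0.03)
# Second case: all 15 blood donors are kept as the reference group.
second_case = positivity_threshold(0.129, 0.06)

print(f"{first_case:.2f}")   # 0.20
print(f"{second_case:.2f}")  # 0.31
```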

According to this latter analysis, the test is
specific for MS, since, as shown in Table 1, no control
has a net OD above this threshold. In fact, this result
reflects the fact that the antibody titres in patients
suffering from MS are, for the most part, higher than in
healthy controls who have been in contact with MSRV-1.

In accordance with the first method of
calculation, and as shown in Figure 41 and in Table 3, 6 of the
15 MS sera give a positive result (OD greater than or
equal to 0.2), indicating the presence of IgGs
specifically directed against the S24Q peptide, hence
against a portion of the reverse transcriptase enzyme of
the MSRV-1 retrovirus encoded by its pol gene, and
consequently against the MSRV-1 retrovirus.

Thus, approximately 40% of the MS patients
tested have reacted against an epitope carried by the S24Q
peptide and possess circulating IgGs directed against the
latter.

Two out of 15 blood donors in apparent good
health show a positive result. Thus, it is apparent that
approximately 13% of the symptomless population may have
been in contact with an epitope carried by the S24Q
peptide under conditions which have led to an active
immunization which manifests itself in the persistence of
specific serum IgGs. These conditions are compatible with
an immunization against the MSRV-1 retrovirus reverse
transcriptase during an infection with (and/or
reactivation of) the MSRV-1 retrovirus. The absence of apparent
neurological pathology recalling MS in these seropositive
controls may indicate that they are healthy carriers and
have eliminated an infectious virus after immunizing
themselves, or that they constitute an at-risk population
of chronic carriers. In effect, epidemiological data
showing that a pathogenic agent present in the environment
of regions of high prevalence of MS may be the cause of
this disease imply that a fraction of the population free
from MS has necessarily been in contact with such a
pathogenic agent. It has been shown that the MSRV-1
retrovirus constitutes all or part of this "pathogenic
agent" at the source of MS, and it is hence normal for
controls taken from a healthy population to possess IgG
type antibodies against components of the MSRV-1
retrovirus.

Lastly, the detection of anti-S24Q antibodies in
only one out of two MS cases tested here may reflect the
fact that this peptide does not represent an
immunodominant MSRV-1 epitope, that inter-individual
strain variations may induce an immunization against a
divergent peptide motif in the same region, or that the
course of the disease and the treatments followed may
modulate over time the antibody response against the S24Q
peptide.



d) Detection of anti-MSRV-1 IgM antibodies by
ELISA:

The ELISA technique with the S24Q peptide was
used to test for the presence of anti-MSRV-1 IgM specific
antibodies in the same sera as above.

Figure 42 shows the results for each serum tested
with an anti-IgM antibody. Each vertical bar represents
the net optical density (OD at 492 nm) of a serum tested.
The ordinate axis gives the net OD at the top of the
vertical bars. The first 15 vertical bars lying to the
left of the vertical line cutting the abscissa axis
represent the sera of 15 healthy controls (blood donors),
and the vertical bars lying to the right of the vertical
broken line represent the sera of 15 cases of MS tested.

The mean of the OD values for the MS cases
tested is 1.6.

The mean of the net OD values for the controls
is 0.7.

The standard deviation of the negative controls
is 0.6.

The threshold of theoretical positivity may be
calculated according to the formula:

threshold value = (mean of the OD values of the negative
controls) + (3 x standard deviation of
the OD values of the negative controls)

The threshold value is hence equal to 0.7 + (3 x 0.6) =
2.5.

The negative results represent a non-specific
"background" of the presence of antibodies directed
specifically against an epitope of the peptide.

According to this analysis, and as shown in
Figure 42 and in the corresponding Table 4, the IgM test
is specific for MS, since no control has a net OD above
the threshold. 6 of the 15 MS sera produce a positive IgM
result.
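Applying the IgM threshold to a set of sera amounts to a simple comparison. In the sketch below only the control mean (0.7) and standard deviation (0.6) come from the text; the per-serum net OD values are hypothetical, and treating a value equal to the threshold as positive is an assumption.

```python
# Minimal sketch: classify sera against the IgM positivity threshold.

def positivity_threshold(mean_od: float, sd_od: float, k: int = 3) -> float:
    """Mean OD of the negative controls plus k standard deviations."""
    return mean_od + k * sd_od


def positive_sera(net_ods: dict, threshold: float) -> list:
    """Names of sera whose net OD reaches the threshold (assumed: >=)."""
    return [name for name, od in net_ods.items() if od >= threshold]


# Control statistics quoted in the text; rounded as in the text.
threshold = round(positivity_threshold(0.7, 0.6), 2)

# Hypothetical net OD values, for illustration only.
hypothetical_net_ods = {
    "control_01": 0.4,
    "control_02": 1.9,
    "ms_01": 3.1,
    "ms_02": 1.2,
    "ms_03": 2.8,
}

print(threshold, positive_sera(hypothetical_net_ods, threshold))
```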

The difference in seroprevalence between the MS
and control populations is extremely significant:
"chi-squared" test, p < 0.002.

These results point to an aetiopathogenic role
of MSRV-1 in MS.

Thus, the detection of IgM and IgG antibodies
against the S24Q peptide makes it possible to evaluate,
alone or in combination with other MSRV-1 peptides, the
course of an MSRV-1 infection and/or of the viral
reactivation of MSRV-1.
It is possible, as a result of the new
discoveries made and the new methods developed by the
inventors, to permit the improved implementation of
diagnostic tests for MSRV-1 infection and/or reactivation
and to evaluate a therapy in MS and/or RA on the basis of
its efficacy in "negativing" the detection of these agents
in the patient's biological fluids. Furthermore, early
detection in individuals not yet displaying neurological
signs of MS or rheumatological signs of RA could make it
possible to institute a treatment which would be all the
more effective with respect to the subsequent clinical
course for the fact that it would precede the lesion stage
which corresponds to the onset of the clinical disorders.
Now, at the present time, a diagnosis of MS or RA cannot
be established before a symptomatology of lesions has set
in, and hence no treatment is instituted before the
emergence of a clinical picture suggestive of lesions
which are already significant. The diagnosis of an MSRV-1
and/or MSRV-2 infection and/or reactivation in man is
hence of decisive importance, and the present invention
provides the means of doing this.

It is thus possible, apart from carrying out a
diagnosis of MSRV-1 infection and/or reactivation, to
evaluate a therapy in MS on the basis of its efficacy in
"negativing" the detection of these agents in the
patients' biological fluids.

EXAMPLE 18:

1) MATERIALS AND METHODS

- Patients and clinical samples

Choroid plexus cells from MS patients and
controls were obtained from the brain-cell library,
Laboratoire R. Escourolles, Hopital de la Salpetriere,
Paris, France. Non-tumoral leptomeningeal cells from
controls were obtained as previously described (26).
Peripheral blood from MS and control patients, used for
obtaining B-cell lines and plasma, was obtained from the
Neurological Departments, CHU de Grenoble, and from
INSERM U 134, Hopital de la Salpetriere, France. Clinical
details and origin of the 10 MS patients and of the 10
patients with other neurological diseases who provided CSF
samples are given in Table 6.

- Cell cultures, virus isolation and purification

All cell-types were cultured as previously
described (3, 5, 26).

All cultures were regularly screened for mycoplasma
contamination with an ELISA mycoplasma-detection kit
(Boehringer). None of the cell extracts or supernatants
used contained detectable mycoplasma.

Extracellular virion purification and sucrose density
gradients were performed as previously described (3, 5,
26). From each sucrose gradient, 0.5-1 ml fractions were
collected from the top of the tubes, with a 1000 µl
Pipetman and a different sterile tip for each fraction.
60 µl were used for RT activity assay and the rest was
mixed with 1 volume of buffer containing 4 M guanidinium
thiocyanate, 0.5% N-lauroyl sarcosine, 25 mM EDTA and 0.2%
β-mercaptoethanol, adjusted to pH 5.5 with acetic acid. These
mixtures were frozen at -80°C for further RNA extraction
or directly processed according to Chomczynski (20), with
an overnight precipitation step at -20°C, in the presence of
RNase-free glycogen (Boehringer). RNA was dissolved in 20 to
50 µl of DEPC-treated water in the presence of 1-2 µl of
recombinant RNase inhibitor (PROMEGA) and 0.1 mM DTT. 10 µl
aliquots were used for each RT-PCR.
- Reverse transcriptase activity

RT-activity was tested with 20 mM Mg2+ and
poly-Cm or poly-C templates, in virion pellets or fractions from
sucrose gradients as previously described (3, 5, 26).

- cDNA synthesis and 'Pan-retro' RT-PCR with degenerate
primers

A total RT-activity between 10^6 and 10^7 dpm was
required in the fraction containing the peak of purified
virions. The "Pan-retro" RT-PCR technique (27) was
performed on virion RNA extracted by the method of
Chomczynski (20) and dissolved in 20 µl RNase-free water.
5 µl RNA solution was incubated for 30 min at 37°C with
0.3 units (3 units for CSF series) of RNase-free DNase-1
(Boehringer) in a 20 µl reaction containing 7.5 mM random
hexamers, 5 mM Hepes-HCl pH 6.9, 75 mM KCl, 3 mM MgCl2, 10
mM DTT, 50 mM Tris-HCl pH 7.5, 0.5 mM each dNTP, and 20
units recombinant RNase inhibitor (Promega). The DNase was
then heat inactivated at 80°C for 10 min. 20 units MoMLV
RT (Pharmacia) and a further 20 units of RNase inhibitor
were added to each tube in a Genesphere™ enclosure
(Safetech, Ireland) and cDNA was synthesised for 90 min at
37°C. Following reverse transcription, the cDNA was boiled
for 5 min then cooled rapidly on ice. The Round 1 PCR mix
(final volume 25 µl per reaction; 20 mM Tris-HCl pH 8.4,
60 mM KCl, 2.5 mM MgCl2, 200 ng each of primers PAN-UO and
PAN-DI [see Figure 44], 0.2 mM each dNTP) was treated with
0.3 units DNase-1 and then heat inactivated as above.
2.5 µl cDNA was added in the Genesphere™ enclosure and the
tubes heated to 80°C before adding 0.5 units Taq
polymerase (Perkin Elmer) individually to each tube ("hot
start"). Round 1 PCR parameters were 35 cycles of 95°C for
1 min, 34°C for 30 sec, 72°C for 1 min, with a final 7 min
extension at 72°C. 0.5 µl of Round 1 PCR product was
transferred to the Round 2 DNase-treated PCR mix
(composition as for Round 1 but containing primers PAN-UI
and PAN-DI) using the "hot start" procedure. Round 2 PCR
parameters were as for Round 1 but using 30 cycles only
and annealing at 45°C for 1 min.
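The two cycling programmes can be written down as data, which also makes the nominal run time easy to check. This is a sketch only: the class and field names are illustrative, ramp times between temperatures are ignored, and the 7 min final extension is assumed to carry over to Round 2 because the text says its parameters are "as for Round 1".

```python
# Minimal sketch of the two "Pan-retro" PCR programmes as data.
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    cycles: int
    steps: tuple            # (temperature in °C, duration in s) for one cycle
    final_extension_s: int  # final hold at 72°C

    def hold_time_s(self) -> int:
        """Total programmed hold time, ignoring ramp times."""
        per_cycle = sum(duration for _, duration in self.steps)
        return self.cycles * per_cycle + self.final_extension_s


ROUND_1 = PcrProgram(
    cycles=35,
    steps=((95, 60), (34, 30), (72, 60)),
    final_extension_s=7 * 60,
)

# Round 2: 30 cycles and annealing at 45°C for 1 min; other values assumed
# unchanged from Round 1.
ROUND_2 = PcrProgram(
    cycles=30,
    steps=((95, 60), (45, 60), (72, 60)),
    final_extension_s=7 * 60,
)

for name, program in (("Round 1", ROUND_1), ("Round 2", ROUND_2)):
    print(name, program.hold_time_s() / 60, "min")
```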

- Cloning of PCR products

PCR products were cloned using the TA-cloning®
kit (British Biotechnology) according to the
manufacturer's recommendations.

- Sequencing

Sequencing reactions were performed using the
"Prism ready reaction kit dye deoxyterminator cycle
sequencing kit" (Applied Biosystems). Automatic sequence
analysis was performed on an automatic sequencer (Applied
Biosystems, 373 A).

- RT-PCR with ST1 primer sets

The first PCR round was performed directly from the
cDNA reaction mixture according to the one-step RT-PCR
technique described by Mallet et al. (28). This one-step
RT-PCR procedure reduced the probability of airborne
contamination when opening the tubes and transferring PCR
reagents after an independent cDNA synthesis. RNA was
extracted as previously from 2 ml of plasma (snap-frozen in
liquid nitrogen and stored at -80°C) or from a 500 µl
sucrose fraction with a total RT-activity above 10^6 dpm,
and resuspended in 50 µl of RNase-free water. For each
RT-PCR reaction, 10 µl of RNA solution was incubated in a
Perkin-Elmer 480 thermocycler for 15 min at 20°C with 1 U of
RNase-free DNase 1 and 1.2 µl of 10X DNase buffer (50 mM
Tris, 10 mM MgCl2 and 0.1 mM DTT) containing 1 U/µl of
RNase inhibitor (PROMEGA), and heated at 70°C for 10 min for
DNase inactivation. The solution was placed on ice and
mixed (in conditions preventing airborne dust/DNA
contamination) with 88 µl of PCR mix containing: 1X Taq
buffer, 25 nM/tube dNTPs, 40 pM/tube of each first round
primer (ST1.1 upstream primer:
5' AGGAGTAAGGAAACCCAACGGAC 3' (SEQ ID NO: 99); ST1.1
downstream primer: 5' TAAGAGTTGCACAAGTGCG 3' (SEQ ID
NO:100)), 2.5 U/tube of Taq (Appligene) and 10 U/tube of
AMV-RT (Boehringer). Each tube was further incubated in a
Perkin-Elmer 480 thermocycler for 10 min at 65°C, followed
by 2 h at 42°C for cDNA synthesis and 5 min at 95°C for
inactivation of AMV-RT and DNA denaturation. First round
parameters were 40 cycles of 95°C for 1 min, 53°C for 2.5
min, 72°C for 1 min, with a final extension of 10 min at
72°C. 10 µl of the first round were transferred to the
second round PCR mix previously treated at 20°C for 15 min
with RNase-free DNase 1 (0.02 U/µl) followed by DNase
inactivation at 70°C for 10 min. This mix contained 1X Taq
buffer, 25 nM/tube dNTPs, 40 pM/tube of each second round
primer [ST1.2 upstream primer: 5' TCAGGGATAGCCCCCATCTAT 3'
(SEQ ID NO:101); ST1.2 downstream primer:
5' AACCCTTTGCCACTACATCAATTT 3' (SEQ ID NO: 102)] and
2.5 U/tube of Taq (Appligene). Second round parameters
were 30 cycles of 95°C for 1 min, 53°C for 1.5 min, 72°C
for 1 min, with a final extension of 8 min at 72°C. 20 µl
of this nested RT-PCR product were deposited on a 0.7%
agarose gel containing ethidium bromide and exposed to UV
light for the visualization of amplified products.

- Hybridisation analysis of PCR products: MSRV-pol
detection by ELOSA

The protocol was essentially as previously
described (21) but with the following modifications: Nunc
Maxisorb microtitre plates were coated with 100 ng per
well capture probe CpV1b (see Figure 44) either by passive
adsorption (21) or alternatively by using streptavidin
coated plates and biotinylated CpV1b. Peroxidase-labelled
detector probe DpV1 (see Figure 44) was used and the assay
cut-off was defined as the mean of 4 negative controls
plus 0.2 OD492 units.
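The ELOSA cut-off rule (mean of 4 negative controls plus 0.2 OD492 units) can be sketched as below. The control readings are hypothetical, and the function names are illustrative; treating a value strictly above the cut-off as positive is an assumption.

```python
# Minimal sketch of the ELOSA cut-off: mean of 4 negative controls + 0.2 OD.
from statistics import mean


def elosa_cutoff(negative_control_ods, margin: float = 0.2) -> float:
    """Assay cut-off from exactly 4 negative-control OD492 readings."""
    if len(negative_control_ods) != 4:
        raise ValueError("the protocol uses 4 negative controls")
    return mean(negative_control_ods) + margin


def is_positive(sample_od: float, cutoff: float) -> bool:
    """Assumed rule: a sample is positive when its OD exceeds the cut-off."""
    return sample_od > cutoff


# Hypothetical OD492 readings, for illustration only.
negatives = [0.05, 0.06, 0.04, 0.05]
cutoff = elosa_cutoff(negatives)

print(round(cutoff, 2), is_positive(0.80, cutoff), is_positive(0.10, cutoff))
```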

- RNA extraction, cDNA synthesis and PCR amplification
from MS plasma samples:

Total RNA was extracted from human MS plasma by
a guanidinium method as described elsewhere (29). Total RNA
extracted from 100 µl of plasma was treated with
RNase-free DNase I (0.1 U/µl; Boehringer Mannheim, France) and
reverse transcribed under the conditions recommended by
the manufacturer, using Superscript reverse transcriptase
(Gibco-BRL, France). The resulting cDNAs were amplified by
semi-nested PCR through 35 cycles (94°C 1 min, 55°C 1 min,
72°C 1 min 30 sec) and 72°C 8 min for a final extension.
Three different fragments in the RT region were amplified
by the following specific primers:

- in the protease (PRT) region, for the 1st and
2nd round of PCR, respectively, sense primer
[5' TCC AGC AGC AGG ACT GAG GGT 3' (SEQ ID NO:103)] and
antisense primers [5' CTG TCC GTT GGG TTT CCT TAC TCC T 3'
(SEQ ID NO: 104) / 5' GAC AGC AAA TGG GTA TTC CTT TCC 3'
(SEQ ID NO: 105)]

- in the fragment A of the RT region (cf. Fig.
46), for the 1st and 2nd round of PCR, respectively, sense
primer [5' AGG AGT AAG GAA ACC CAA CGG ACA G 3' (SEQ ID
NO:106)] and antisense primers [5' TGT ATA TAA TGG TCT GGC
TAT TGG G 3' (SEQ ID NO: 107) / 5' TTC GGC AGA AAC CTG TTA
TGC CAA GG 3' (SEQ ID NO: 108)]

- in the fragment B of the RT region (cf. Fig.
46), for the 1st and 2nd round of PCR, respectively, sense
primers [5' GGC TCT GCT CAC AGG AGA TTA GAT AC 3' (SEQ ID
NO: 109) / 5' AAA GGC ACC AGG GCC CTC AGT GAG GA 3' (SEQ ID
NO: 110)] and antisense primer [5' GGT TTA AGA GTT GCA
CAA GTG CGC AGT C 3' (SEQ ID NO: 101)].
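Several of the primers listed above overlap one another, which can be checked mechanically from the printed sequences. The sketch below is an illustrative consistency check, not part of the described method; the variable and function names are assumptions, and the ST1.1 sequences are those quoted for the ST1 primer sets.

```python
# Minimal sketch: check overlaps between primers from their printed sequences.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the triplet spacing used in the printed listing."""
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return clean(seq).translate(COMPLEMENT)[::-1]


PRT_ANTISENSE_ROUND_1 = "CTG TCC GTT GGG TTT CCT TAC TCC T"  # SEQ ID NO: 104
FRAGMENT_A_SENSE = "AGG AGT AAG GAA ACC CAA CGG ACA G"       # SEQ ID NO: 106
FRAGMENT_B_ANTISENSE = "GGT TTA AGA GTT GCA CAA GTG CGC AGT C"
ST1_1_UPSTREAM = "AGGAGTAAGGAAACCCAACGGAC"                   # SEQ ID NO: 99
ST1_1_DOWNSTREAM = "TAAGAGTTGCACAAGTGCG"                     # SEQ ID NO: 100

# The first-round PRT antisense primer is the exact reverse complement of
# the fragment A sense primer, so the two fragments abut at this site.
print(reverse_complement(PRT_ANTISENSE_ROUND_1) == clean(FRAGMENT_A_SENSE))

# The ST1.1 primers are contained within the fragment A sense primer and
# the fragment B antisense primer, respectively.
print(clean(FRAGMENT_A_SENSE).startswith(ST1_1_UPSTREAM))
print(ST1_1_DOWNSTREAM in clean(FRAGMENT_B_ANTISENSE))
```

All three checks print True for the sequences as listed.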
The amplified fragments were analysed on
ethidium bromide-stained agarose gels, cloned in TA
cloning vector (Invitrogen) and sequenced.

2) RESULTS

- Specific retroviral RNA is found in extracellular
virions from MS patient-derived cell cultures and in MS
patients' CSF.

Choroid plexus cells (4) (obtained post-mortem)
and EBV-immortalized peripheral blood B-lymphocytes (30,
31) from MS patients gave rise to cultures expressing
100-120 nm viral particles associated with RT-activity similar
to that of the original LM7 isolate (3). Similar
cell-types from non-MS donors produced neither this RT-activity
nor virions. All the 'infected' cultures were poorly
and/or transiently productive and/or had a limited
lifespan. Therefore, in order to analyse the genomic RNA
present in the very limited quantity of extracellular
virions, we used an RT-PCR approach to amplify, with
degenerate primers, a conserved region of the pol gene
present in all known retroviruses (12); the techniques
based on this approach will be called "Pan-retro" RT-PCR.
Extensive DNase treatment of samples and reagents was
essential, because human DNA contains many endogenous
retroviral elements amplifiable by this technique.
"Pan-retro" RT-PCR experiments were performed on
sucrose-density gradient purified virions from supernatants of
different types of cell cultures and their non-infected
controls: (i) choroid plexus cells sampled post-mortem
from MS brain (PLI-1), (ii) choroid plexus cells from
non-MS brain autopsy, infected by co-culture with irradiated
LM7 cells (LM7P), and (iii) identical non-infected
choroid-plexus cells. "Early" B-cell lines obtained by
spontaneous in vitro transformation of two
EBV-seropositive individuals, (iv) one MS patient and (v) one
non-MS control, were also analysed. Figure 43 illustrates
the RT-activity in sucrose-gradient fractions obtained
from the B-cell cultures. The technique described by Shih
et al. (12) was modified in a semi-nested RT-PCR protocol
(27) using degenerate primers (Fig. 2) and extensive DNase
treatment. PCR amplifications were performed in London
(Dpt of Virology, U.C.L.M.S.) on coded aliquots of the
density gradient fractions. Blind and systematic cloning
and sequencing of the PCR products were undertaken in an
independent laboratory (bioMerieux, Lyon). After complete
sequencing of 20 to 30 clones per sucrose gradient
fraction, the codes were broken and results analysed in
parallel with the RT-activity data.

Table 5 presents the distribution of sequences obtained
from sucrose gradient fractions containing the peak of
viral RT-activity in MS-derived cultures and also the
sequences amplified from the corresponding RT-activity
negative fractions of uninfected cultures. The predominant
sequence detected in bands of the expected size
(approximately 140 bp)
amplified in all the RT-activity positive fractions (but
not in the RT-activity negative fractions) was different
from known retroviruses and was designated MSRV-cpol.
MSRV-cpol sequences exhibited partial homology (70-75%)
with ERV9, a previously described endogenous retroviral
sequence (18). A few ERV9 sequences (>90% homology with
ERV9) were also present but clearly represented a minority
of clones. In addition to typical pol sequences, numerous
PCR artefacts (primer multimers, concatemers or
single-primer amplifications) related to the use of degenerate
primers and low-temperature annealing, were found in all
samples (Table 5).

Figure 44 shows an alignment of a consensus sequence of
MSRV-cpol with the corresponding VLPQG / YMDD region of
diverse retroviruses. Figure 45 displays a phylogenic tree
based on the evolutionarily conserved amino acid sequences
of both exogenous and endogenous retroviruses in this
region. From this tree it can be seen that the pol gene of
MSRV is phylogenically related to the C-type group of
oncovirinae.

A small scale study was performed to determine the
prevalence of MSRV-cpol sequences in the CSF of patients
with MS. Identification of MSRV-cpol in PCR products by
cloning and sequencing is both laborious and time
consuming. We therefore devised an enzyme-linked
oligosorbent assay (ELOSA), using a capture probe (CpV1B)
and a peroxidase-labelled detector probe (DpV1), for the
rapid identification of MSRV-cpol sequences in
'Pan-retrovirus' PCR products (Figure 44). The specificity of
this sandwich hybridisation-based assay for MSRV-cpol was
tested with both distantly related (HIV and MoMLV) and
closely related (ERV9) pol sequences. No significant cross
reactivity with such targets was observed despite the
ability of the ELOSA to detect as little as 0.01 ng of
MSRV-cpol DNA.

Cerebrospinal fluid (CSF) samples were available from 10
patients with MS and from 10 patients with other
neurological disorders. Total RNA was extracted from CSF
pellets, reverse transcribed and amplified as above. ELOSA
analysis (Table 6) of the PCR products revealed MSRV-cpol
sequences in 5 of the 10 MS patient samples but in none of
the 10 samples from patients with other neurological
diseases (P<0.05). The presence of MSRV-cpol did not
appear to be correlated with age, sex or type of MS, but
was seen in untreated patients only (5/6). No patient with
immunosuppressive therapy was found positive (0/4). No
correlation between MSRV-cpol detection and CSF cell count
was observed.
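The comparison of 5/10 positive MS samples with 0/10 positive controls can be checked with a standard test for small 2 x 2 tables. The text does not state which test gave P<0.05; the sketch below assumes a two-sided Fisher's exact test, implemented with the standard library only, and the function name is illustrative.

```python
# Minimal sketch: two-sided Fisher's exact test for a 2 x 2 table,
# assuming this (unstated) test for the CSF detection results.
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """P-value for the table [[a, b], [c, d]] with fixed margins.

    Sums the hypergeometric probabilities of all tables that are no more
    likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so that equally likely tables are included.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: MS patients, other neurological diseases.
# Columns: MSRV-cpol detected, not detected.
p_value = fisher_exact_two_sided(5, 5, 0, 10)
print(f"p = {p_value:.4f}")  # p = 0.0325
```

Under this assumption the result is consistent with the P<0.05 quoted in the text.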

- Cloning and sequencing a larger region of the pol gene 

An independent identification of the MSRV genomic sequence
was obtained by a non-PCR approach using RNA extracted from
concentrated virions derived from 2.5 liters of
LM7-infected sub-cultures of choroid plexus cells. A limited
number of clones was obtained by direct cloning of the cDNA,
one of which (PSJ17) showed partial homology with ERV9 pol.
Specific primers based on the MSRV-cpol region and on the
PSJ17 clone amplified a 740 bp fragment linking the two
independent sequences in RNA extracted from purified
virions. PSJ17 was localised on the 3' side of MSRV-cpol.
Further sequence extension on the 5' side of MSRV-cpol and
on the 3' side of PSJ17 was obtained using RT-PCR approaches
on RNA from purified LM7-like virions produced in MS choroid
plexus cultures (4).

In Figure 46, the nucleotide sequence corresponding to
overlapping clones obtained by sequence extension in the
pol gene is represented with the amino acid translation
corresponding to the putative open reading frames (ORFs) of
the protease and of the reverse transcriptase. The active
site motifs of the protease (PRT) and of the reverse
transcriptase (RT) are underlined. In the C-terminal region
of the RT sequence, the dispersed amino acid residues
regularly present in retroviral RNase H domains are also
underlined.

- Non-degenerate primers detect MSRV-specific RNA in
virions associated with the peak of RT-activity and in
MS patients' plasma

PCR primers (ST1.1 primer set; positions 603-625/1732-1714,
on Fig. 4) based on overlapping clones in the pol gene
amplified a 1.15 kb segment of the RT region from several
different isolates obtained from different MS patients.
Nested primers (ST1.2; positions 869-889/1513-1490, on
Fig. 46) generated a 700 bp fragment (Figure 47) which was
more easily visualised by ethidium bromide staining than
the first-round product generated by ST1.1. The specificity
of PCR products was confirmed by stringent hybridisation
with a peroxidase-labelled MSRV-cpol probe (Fig. 44), using
the ELOSA technique (21).
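
The template span covered by each primer pair follows from the quoted coordinates. The sketch below is a minimal calculation assuming inclusive 1-based positions, with the antisense primer written 5' to 3' (so its first coordinate is the higher one); the function name is illustrative. The sizes quoted in the text (1.15 kb and 700 bp) are presumably rounded gel estimates.

```python
def amplicon_span(sense, antisense):
    """Template length covered by a primer pair, both ends inclusive.

    sense     = (start, end) of the sense primer
    antisense = (start, end) of the antisense primer, written 5'->3',
                so start > end on the template numbering.
    """
    return antisense[0] - sense[0] + 1

st1_1 = amplicon_span((603, 625), (1732, 1714))   # first round
st1_2 = amplicon_span((869, 889), (1513, 1490))   # nested round

# The nested pair must lie entirely inside the first-round product.
nested_inside = 603 <= 869 and 1513 <= 1732

print(st1_1, st1_2, nested_inside)
```

The nested span of 645 positions equals the length given for SEQ ID NO: 8 in the sequence listing.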

The ST1.1 and ST1.2 primer sets were used to detect
extracellular MSRV RNA in human plasma, although they are
not optimal for this application. Figure 47 illustrates the
results of PCR amplification of cDNA derived from 2 MS
patient and 2 control plasma samples tested in parallel
with cDNA from the sucrose density gradient fractions of
an MS choroid plexus isolate. Taq-sequencing of the 700 bp
bands confirmed the presence of MSRV sequence. A very
faint 700 bp band is also visible in fraction 10, which
corresponds to the bottom of the tube where aggregated
particles usually sediment. Control RT-PCR for cellular
aldolase transcripts on plasma-derived RNA was negative,
indicating that the results were not due to cellular RNA
released by cell lysis during plasma separation. It should
be noted that this PCR technique was not designed for
epidemiological studies since its sensitivity is impaired
by the length of the cDNA required (1.15 kb).

Non-degenerate primers amplifying three fragments of the
pol gene (the whole protease region, regions A and B of the
reverse transcriptase; cf. Fig. 46) were also used to
confirm the presence of MSRV sequences in DNase-treated RNA
from MS plasma. These fragments were amplified from the
plasma of a further 4 MS patients with active disease.
Sequence analysis confirmed that the PRT and RT regions
were homologous (>95% and >90% respectively) to MSRV
sequences previously obtained from culture virions. No such
sequences were detected in plasma from healthy controls
(n=4), tested in parallel with MS plasma.
3) DISCUSSION
- Phylogeny of MSRV

From the results of this study, it can be concluded that
the virus previously referred to as "LM7" (3, 5, 26)
possesses an RNA genome containing the MSRV pol sequences
described here.

The conserved RT motif of both MSRV and ERV9 is two amino
acids shorter than that of other retroviruses, apart from
human foamy viruses which nonetheless have a functional
RT. The potential ORF encompassing the entire PRT-RT
region is consistent with the virion-associated
RT-activity detected in sucrose density gradients with
infected culture supernatants. Moreover, since we have
recently succeeded in expressing a recombinant protein
from the sequence of MSRV protease cloned from MS plasma,
we can confirm the reality of the potential PRT ORF.
Similar cloning and expression of other sequences
containing potential ORFs for MSRV proteins is being
undertaken to confirm their ability to encode enzymes and
structural proteins of MSRV virions.

The phylogenetic tree in Figure 45, based on the most
conserved amino acid sequence in retroviruses
(VLPQG...YXDD), shows that the MSRV pol gene is related to
the C-type oncoviruses. Apart from ERV9, the closest known
retroviral element is RTVL-H, a human endogenous sequence
known to have a subtype with a functional pol gene (32).
In the pol region, this phylogenetic affiliation to C-type
oncoviruses apparently contradicts our previous
assumptions based on the general morphology of the
particles observed by electron microscopy (EM), which were
compatible with a B or D-type oncovirus (3, 5, 26).
However, preliminary data on env sequences detected in
MSRV virions would suggest a greater phylogenetic proximity
to D-type. Such differences in the phylogenies of the pol
and env genes have been described in MPMV and suggest a
recombinatorial origin in D-type retroviruses (33). D to C
type morphological conversion is also possible since it
has been reported that a single amino acid substitution in
the gag protein can convert retrovirus morphology to that
of a different type (34).

- Is MSRV an exogenous retrovirus sharing extensive
homology with a related endogenous retrovirus family or an
endogenous retrovirus producing extracellular virions?

Southern blot analysis with an MSRV pol probe under
stringent conditions showed hybridisation with a
multicopy endogenous family (data not presented),
indicating the existence of endogenous elements more
closely related to MSRV than ERV9 itself. Consequently, we
were unable to look for a virion-specific provirus in
MSRV-producing cells. In agreement with Southern blot
findings, PCR studies on genomic DNA showed multiple band
amplification of MSRV-related endogenous sequences. Since
pol is the most conserved retroviral gene, the sequence
described here is the least suitable region to
discriminate between exogenous and endogenous sequences.
It is hoped that sequence information from other parts of
the genome may permit such a discrimination, even if only
on a tiny portion, as has recently been demonstrated for
the Jaagsiekte retrovirus (JSRV) of sheep (35). With such
sequence data, it would then become possible to identify
the MSRV-specific provirus in the genome of
virion-producing cell cultures.

MSRV could represent a virion-producing exogenous member
of an ERV9-like endogenous family, just as exogenous
strains exist in the well-studied mouse mammary tumour
virus (MMTV) and murine leukaemia virus (MuLV) retroviral
families of mice, and also in the JSRV retroviral family
of sheep (36). Alternatively, it is also conceivable that
the extracellular MSRV virions may be produced by a
replication-competent endogenous provirus. Whether MSRV is
exogenous or endogenous, conceptual similarities exist
with the category of retroviruses represented by MuLV,
MMTV and JSRV. Unlike defective endogenous elements, this
category of agents is known to produce infectious and
pathogenic virions, to cause neurological disease (37) and
solid tumours/leukaemias (36, 38), and to express
"endogenous superantigens" (39, 40). Furthermore, in MuLV
infections, the genetic endogenous retroviral background
of the mouse strain can determine susceptibility or
resistance to disease (39, 41). Indeed, such interactions
between an infectious retrovirus and its endogenous
counterpart may be relevant in the pathogenesis of MS,
since endogenous retroviral genotypes are not identical in
all individuals. A genetic control due to related
endogenous retroviral genotypes could therefore contribute
to the known hereditary susceptibility to MS (43), if MSRV
does indeed play an active role in this disease.

Elsewhere, the data in Table 5 suggest that ERV9 elements
may be co-expressed, possibly via trans-activation in
infected cells, and give rise to heterologous RNA
packaging in MSRV virions. Such heterologous packaging is
known to occur in other retroviral systems (42).

- A role for the numerous common viruses previously evoked
in MS?

Among the numerous reports of viruses putatively
involved in the aetiopathogenesis of MS, a significant
proportion focus on two viral families, the
paramyxoviridae and the herpesviridae. Regarding the
paramyxoviridae, the key observation is of a frequently
increased antibody titer to measles virus in MS patients,
essentially directed, in CSF, against measles fusion
protein (44). The existence of amino acid similarities
between conserved domains of the fusion proteins of
paramyxoviridae and the transmembrane protein of
retroviruses (45) may explain this observation if
antigenic cross-reactivity between these two proteins
occurred.

With regard to the herpesvirus family, the involvement of
Epstein-Barr Virus (EBV), Herpes Simplex Virus type 1
(HSV-1) and, most recently, Human Herpes Virus 6 (HHV-6)
has been proposed (31, 46, 47). From our previous studies
and from those of other groups, it appears that
herpesviruses may play an important role in MSRV
expression: we have shown that HSV-1 immediate-early ICP0
and ICP4 proteins can transactivate MSRV/LM7 in vitro (6)
and Haahr et al. have proposed an important
epidemiological role for EBV, as a co-factor in MS,
triggering retrovirus reactivation (31). The recent
description by Challoner et al. (47) showing significant
expression of HHV-6 proteins in MS plaques may also suggest
a similar role for HHV-6 in the brain.

EXAMPLE 19: MSRV GENOME DETECTION TECHNIQUE

Following 0.45 micron filtration to remove cellular
debris and RNase digestion to remove residual
non-encapsidated RNA, serum was processed to extract viral
RNA by means of adsorption to a silica matrix. Viral RNA was
subjected to DNase digestion, then a combined reverse
transcription-PCR (RT-PCR) reaction was performed using
primers PTpol-A (sense: 5'xxxx3', SEQ ID NO: 183) and
PTpol-F (antisense: 5'xxxx3', SEQ ID NO: 184). A second
round of amplification with nested primers PTpol-B (sense:
5'xxxx3', SEQ ID NO: 185) and PTpol-E (antisense: 5'xxxx3',
SEQ ID NO: 186) generated a 435 bp PCR product which was
identified by gel electrophoresis. The specificity of each
product was confirmed by dideoxy sequencing. Control
reactions without reverse transcriptase were performed to
ensure that the products were derived from viral RNA. In
addition, to exclude the possibility that the extracted
viral RNA might be contaminated with host cell derived
nucleic acids, aliquots were tested by nested PCR for the
presence of pyruvate dehydrogenase (PDH) DNA and RNA.
Samples which generated a signal in either the PDH or the
"no-RT" PCR assays were excluded from the analysis.

Sera from patients with clinically active MS and
controls were amplified by RT-PCR and sequenced.
Virion-associated MSRV-RNA was detected in the serum of 10
of 19 (53%) patients with MS but in only 3 of 44 controls
without MS (P=0.0001). The control group consisted of 8
patients (all MSRV-RNA negative) with rheumatological
disorders and 36 healthy adults. MSRV-RNA titres in both
MS patients and controls were apparently low because even
moderate dilution of sera (<10 fold) caused loss of
signal.

In MS patients, detection of MSRV-RNA was not
associated with age, sex, disease duration, or MS type;
however, a significant negative correlation with treatment
was observed. 26 serum samples were obtained from the 19
patients; 100% of the sera from untreated patients
contained detectable MSRV-RNA whereas it was detectable in
only 4 of 19 samples (21%) obtained during treatment with
corticosteroids and/or azathioprine (P=0.001).

The reason for the apparent loss of virion-associated
MSRV-RNA during immunosuppressive treatment is
unknown but the finding is in agreement with the previous
observations on the detection of MSRV in cerebrospinal
fluid.

TABLE 7

DETECTION OF VIRION ASSOCIATED MSRV-RNA IN MS UNTREATED
PATIENTS & CONTROLS

                                       Positive  Negative  Total  % Positive
Controls without MS (a)                3 (b)     41        44     7%
MS sera untreated at time of sampling  7         0         7      100%

a The control group consisted of 8 patients with
miscellaneous non-MS disorders and 36 healthy adults.
b The detection of MSRV RNA in plasma of a few controls in
conditions which select virion-packaged RNA is consistent
with the knowledge that a virus associated with MS should
be present in a minor proportion of the apparently healthy
population. Indeed, such individuals can be either healthy
carriers or be in the pre-clinical (or sub-clinical) phase
of the disease which can last for years.



METHOD:

- Modified SNAP RNA extraction with filtration and RNase
digestion

(All centrifugations are at room temperature)

Up to 500 microlitres of serum is filtered using
0.45 micron spin filters (Nanosep MF from Flowgen,
Catalogue No. U3-0126, Ref. 0DM45). The serum is spun for
5 min at 130,000 g (or for a further 10 min if necessary).

150 microlitres of filtered serum is incubated
with 10 units RNase One (Promega Catalogue No. M4261) for
30 min at 37°C.

The 150 microlitres was then extracted using the
SNAP RNA extraction kit (Invitrogen) as below:

- 10 micrograms of poly A RNA was added to the
450 microlitres of Binding Buffer to act as a carrier;
this was then added to the serum and mixed by inversion 6
times; 300 microlitres of propan-2-ol was then added and
mixed by inversion 10 times; 500 microlitres was
transferred to the SNAP column and spun at 1300 g for
1 min and the flow-through discarded; the remainder was
then added to the SNAP column and spun at 1300 g for 1 min
and the flow-through discarded; the column was then
washed with 600 microlitres of Super wash and the
flow-through discarded; the column was then washed with 600
microlitres of 1x RNA wash and the flow-through
discarded; this wash was repeated with a 2 min 1300 g
spin and the flow-through discarded; the bound nucleic
acid was then eluted by incubating with 135 microlitres of
RNase free water for 5 min and spun at 1300 g for 1 min.

- 15 microlitres of 10x DNase buffer and 3
microlitres (30 units) of DNase I, RNase free (Boehringer
Mannheim Cat. No. 776 785) was added and incubated for 30
min at 37°C; 450 microlitres of Binding Buffer was added
and mixed by inversion 6 times; 300 microlitres of
propan-2-ol was then added and mixed by inversion 10
times; 500 microlitres was transferred to the SNAP column
and spun at 1300 g for 1 min and the flow-through
discarded; the remainder was then added to the SNAP
column and spun at 1300 g for 1 min and the flow-through
discarded; the column was then washed with 600
microlitres of 1x RNA wash and the flow-through discarded;
this wash was repeated with a 2 min 1300 g spin and the
flow-through discarded; the bound nucleic acid was then
eluted by incubating with 105 microlitres of RNase free
water for 5 min and spun at 1300 g for 1 min.
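
The two-step column loading ("500 microlitres ... the remainder") follows from the volumes listed. The sketch below assumes a 500 microlitre load per spin, as stated in the text, and ignores the volume of the poly A carrier; the function name is illustrative.

```python
def column_loads(total_ul, per_load_ul=500):
    """Split a binding mixture into successive column loads (microlitres)."""
    loads = []
    remaining = total_ul
    while remaining > 0:
        load = min(per_load_ul, remaining)
        loads.append(load)
        remaining -= load
    return loads

# First binding: serum + Binding Buffer + propan-2-ol
first_bind = column_loads(150 + 450 + 300)

# Second binding: eluate + 10x DNase buffer + DNase I + Binding Buffer + propan-2-ol
second_bind = column_loads(135 + 15 + 3 + 450 + 300)

print(first_bind, second_bind)
```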

10 

- Titan RT-PCR

RT-PCR was performed using the Titan one tube RT-PCR
system (Boehringer Mannheim Cat. No. 1 855 476). 25
microlitres of RNA was used in the combined RT-PCR
reaction. The total reaction volume was 50 microlitres.
Promega rRNasin (10 units) was the RNase inhibitor used.
170 ng of primers SEQ ID NO: 183 and SEQ ID NO: 184,
respectively, were used. A single master mix was prepared
and the sample RNA added last. This was performed at room
temperature, not on ice.

The RT step consisted of two sequential 30 min
incubations at 50°C and then 60°C. This was immediately
followed by the PCR which had the following steps:

* Initial denaturation of template at 94°C for 2 min,
* 40 cycles of 94°C for 30 seconds; 60°C for 30 seconds;
68°C for 45 seconds,
* 1 cycle of 68°C for 7 min.

The second round PCR was performed using the
Expand long template PCR system (Boehringer Mannheim Cat.
No. 1681 842). 0.5 microlitres of the RT-PCR mix was added
to 25 microlitres of the round 2 PCR mix. Buffer No. 3 and
50 ng of primers B and E were used. The PCR had the
following steps:

* 5 cycles of 94°C for 30 seconds, 60°C for 30 seconds,
68°C for 45 seconds,
* 1 cycle of 68°C for 7 min.

The PCR products were then run on a 2% agarose gel.

The no RT controls were performed using the "Expand"
PCR system for both rounds. The first round was 40 cycles
and the second round 20 cycles.
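
The cycling programmes for the two rounds can be written down as data and the programmed hold time summed. This is a minimal sketch: the representation and names are illustrative, and ramp times between temperatures are ignored.

```python
# Each step is (temperature in deg C, hold time in seconds).
RT_STEP = [(50, 30 * 60), (60, 30 * 60)]        # two sequential 30 min incubations
CYCLE = [(94, 30), (60, 30), (68, 45)]          # denature, anneal, extend
FINAL_EXTENSION = [(68, 7 * 60)]

def programme(pre_steps, n_cycles, post_steps):
    """Flatten a cycling programme into an ordered list of steps."""
    return list(pre_steps) + CYCLE * n_cycles + list(post_steps)

def hold_minutes(steps):
    """Total programmed hold time in minutes (ramping not included)."""
    return sum(seconds for _, seconds in steps) / 60

# Round 1: RT step, 2 min initial denaturation, 40 cycles, final extension
round1 = programme(RT_STEP + [(94, 2 * 60)], 40, FINAL_EXTENSION)
# Round 2: 5 cycles and a final extension, as listed
round2 = programme([], 5, FINAL_EXTENSION)

print(hold_minutes(round1), hold_minutes(round2))
```

With the values given, the first round holds for 139 min and the second for 15.75 min, before any ramping time.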

As a positive control a DNA dilution series was
used in both the RT-PCR and the "no RT" PCR. For a result
to be valid the RT-PCR and "no-RT" PCRs had to have
detected DNA equivalent to between 1 and 0.1 cells.

The analysis of PCR products of an approximately
435 bp fragment in the pol region is shown in Table 8.

TABLE 8

ANALYSIS OF PCR PRODUCTS WITH ORF *

Exp    Disease  Clone  ORF  Fragment (bp)  AA-RT Motif Site
46-7   MS       1      +    429            YGDD
                5      +    429            YGDD
                8      +    429            YGDD
68-1   MS       41     +    438            YMDD
                42     +    438            YMDD
                43     +    438            YMDD

* Defective RNA can also be present in circulating
virions, since the fidelity of the MSRV reverse
transcriptase appears to be low and since recombination
events with related endogenous elements can occur. It is
then obvious that the intra- and inter-patient
variability can be greater than that illustrated in this
example, because of these encapsidated defective MSRV RNA
copies.
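
The "AA-RT Motif Site" column reports the residues found at the conserved YXDD site of the reverse transcriptase. The sketch below shows how such a motif can be read from a nucleotide sequence, assuming the standard genetic code. The 27-nucleotide excerpt is taken from SEQ ID NO: 1 in the sequence listing; the reading frame is an assumption chosen so that the motif is in frame, and the function name is illustrative.

```python
import re

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna):
    """Translate in frame 1; codons with ambiguous bases become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

# Excerpt of SEQ ID NO: 1 spanning the motif (frame assumed)
excerpt = "CAGTACATGGATGATTTACTTTTAGTC"
peptide = translate(excerpt)
motif = re.search(r"Y[A-Z]DD", peptide)

print(peptide, motif.group(0))
```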

Table 9, the data of which have been determined from the
alignments of Figures 49 to 53, shows variability:
- between the clones obtained from the same patient plasma
sample in the same PCR amplification experiment; this
means that the patient possesses a virion population which
comprises different MSRV variants at a given time,
- between the sequenced variant populations from different
patients; this means that the variants differ from one
patient to another.

TABLE 9

Degree of identity (percentage) between nucleotide
sequences and between peptide sequences,
by direct comparison of said sequences (see Figures 49-53)

Patient 68-1

  Nucleotide sequences
    SEQ ID NO: 169 and MSRV-pol (SEQ ID NO: 1)    90.4 % b    92.3 % a
    SEQ ID NOs: 170, 171, 172 between them        98.6 % b    98.7 % a
  Peptide sequences
    SEQ ID NOs: 173, 174, 175 and SEQ ID NO:      81 %
    SEQ ID NOs: 173, 174, 175 between them        97 %

Patient 46-7

  Nucleotide sequences
    SEQ ID NO: 176 and MSRV-pol (SEQ ID NO: 1)    82.5 % a    84 % b
    SEQ ID NOs: 177, 178, 179 between them        94.5 % a    95.1 % b
  Peptide sequences
    SEQ ID NOs: 180, 181, 182 and SEQ ID NO:      73.5 %
    SEQ ID NOs: 180, 181, 182 between them        89 %

a) this percentage is determined on the basis of sequences
excluding the primers
b) this percentage is determined on the basis of sequences
including the primers.
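
A degree of identity of this kind is the fraction of matching positions between two aligned sequences. The sketch below is a minimal illustration for equal-length, gap-free sequences, with optional trimming at either end to mimic the "excluding the primers" calculation. It does not reproduce the values of Table 9: the two 21-mers used are SEQ ID NO: 18 and SEQ ID NO: 21 from the sequence listing, chosen only as a short example, and the function name is illustrative.

```python
def percent_identity(seq_a, seq_b, skip_5=0, skip_3=0):
    """Percentage of identical positions between two aligned sequences.

    skip_5 / skip_3 drop that many positions from the 5' and 3' ends,
    for example to leave primer-derived positions out of the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    end = len(seq_a) - skip_3
    pairs = list(zip(seq_a[skip_5:end], seq_b[skip_5:end]))
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

seq_18 = "TCAGGGATAGCCCCCATCTAT"
seq_21 = "TTAGGGATAGCCCTCATCTCT"

full = percent_identity(seq_18, seq_21)
trimmed = percent_identity(seq_18, seq_21, skip_5=2)

print(round(full, 1), round(trimmed, 1))
```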

From Figures 53A and 53B, the variability between the
sequences of the tested patients can be determined:
- between SEQ ID NO: 169 and SEQ ID NO: 176: 16.5 % a and
14.8 % b
- between the peptide sequences obtained from
SEQ ID NO: 169 and SEQ ID NO: 176: 20 %.

Four microorganisms are mentioned in the
specification (page 3, lines 15-26) and they are identified
below. They have all been deposited with the ECACC*, in
accordance with the provisions of the Budapest Treaty.

- LM7PC deposited on 22nd July 1992 under No. 92072201,
- PLI-2 deposited on 8th January 1993 under No. 93010817,
- POL-2 deposited on 22nd July 1992 under No. V92072202,
and
- MS7PG deposited on 8th January 1993 under No. V93010816.

* ECACC: European Collection of Animal Cell Cultures
Vaccine Research and Production Laboratory
Public Health Laboratory Service
Centre of Applied Microbiology and Research
Porton Down
Salisbury, Wiltshire SP4 0JG
United Kingdom



(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 4 72 69 84 30 

(B) TELEFAX: 4 72 69 84 31 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1158 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CCCTTTGCCA CTACATCAAT TTTAGGAGTA AGGAAACCCA ACGGACAGTG GAGGTTAGTG 60
CAAGAACTCA GGATTATCAA TGAGGCTGTT GTTCCTCTAT ACCCAGCTGT ACCTAACCCT 120
TATACAGTGC TTTCCCAAAT ACCAGAGGAA GCAGAGTGGT TTACAGTCCT GGACCTTAAG 180
GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 240
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC 300
CCCCATCTAT TTGGCCAGGC ATTAGCCCAA GACTTGAGTC AATTCTCATA CCTGGACACT 360
CTTGTCCTTC AGTACATGGA TGATTTACTT TTAGTCGCCC GTTCAGAAAC CTTGTGCCAT 420
CAAGCCACCC AAGAACTCTT AACTTTCCTC ACTACCTGTG GCTACAAGGT TTCCAAACCA 480
AAGGCTCGGC TCTGCTCACA GGAGATTAGA TACTNAGGGC TAAAATTATC CAAAGGCACC 540
AGGGCCCTCA GTGAGGAACG TATCCAGCCT ATACTGGCTT ATCCTCATCC CAAAACCCTA 600
AAGCAACTAA GAGGGTTCCT TGGCATAACA GGTTTCTGCC GAAAACAGAT TCCCAGGTAC 660
ASCCCAATAG CCAGACCATT ATATACACTA ATTANGGAAA CTCAGAAAGC CAATACCTAT 720
TTAGTAAGAT GGACACCTAC AGAAGTGGCT TTCCAGGCCC TAAAGAAGGC CCTAACCCAA 780
GCCCCAGTGT TCAGCTTGCC AACAGGGCAA GATTTTTCTT TATATGCCAC AGAAAAAACA 840
GGAATAGCTC TAGGAGTCCT TACGCAGGTC TCAGGGATGA GCTTGCAACC CGTGGTATAC 900
CTGAGTAAGG AAATTGATGT AGTGGCAAAG GGTTGGCCTC ATNGTTTATG GGTAATGGNG 960
GCAGTAGCAG TCTNAGTATC TGAAGCAGTT AAAATAATAC AGGGAAGAGA TCTTNCTGTG 1020
TGGACATCTC ATGATGTGAA CGGCATACTC ACTGCTAAAG GAGACTTGTG GTTGTCAGAC 1080
AACCATTTAC TTAANTATCA GGCTCTATTA CTTGAAGAGC CAGTGCTGNG ACTGCGCACT 1140
TGTGCAACTC TTAAACCC 1158

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 297 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CCCTTTGCCA CTACATCAAT TTTAGGAGTA AGGAAACCCA ACGGACAGTG GAGGTTAGTG 60
CAAGAACTCA GGATTATCAA TGAGGCTGTT GTTCCTCTAT ACCCAGCTGT ACCTAACCCT 120
TATACAGTGC TTTCCCAAAT ACCAGAGGAA GCAGAGTGGT TTACAGTCCT GGACCTTAAG 180
GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 240
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAAGGGA 297

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GTTTAGGGAT ANCCCTCATC TCTTTGGTCA GGTACTGGCC CAAGATCTAG GCCACTTCTC 60
AGGTCCAGSN ACTCTGTYCC TTCAG 85

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 86 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GTTCAGGGAT AGCCCCCATC TATTTGGCCA GGCACTAGCT CAATACTTGA GCCAGTTCTC 60
ATACCTGGAC AYTCTYGTCC TTCGGT 86

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTTCARRGAT AGCCCCCATC TATTTGGCCW RGYATTAGCC CAAGACTTGA GYCAATTCTC 60
ATACCTGGAC ACTCTTGTCC TTYRG 85

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GTTCAGGGAT AGCTCCCATC TATTTGGCCT GGCATTAACC CGAGACTTAA GCCAGTTCTY 60
ATACGTGGAC ACTCTTGTCC TTTGG 85

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GTGTTGCCAC AGGGGTTTAR RGATANCYCY CATCTMTTTG GYCWRGYAYT RRCYCRAKAY 60
YTRRGYCAVT TCTYAKRYSY RGSNAYTCTB KYCCTTYRGT ACATGGATGA C 111
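
Several of the sequences above contain IUPAC ambiguity codes (R, Y, W, S, K, M, B, V, N), each standing for a set of possible bases. The sketch below checks whether an unambiguous sequence is compatible with such a degenerate sequence. The function name is illustrative, and the 40-base excerpts are taken from the start of SEQ ID NO: 5, from the corresponding stretch of SEQ ID NO: 1, and from the start of SEQ ID NO: 4.

```python
# IUPAC nucleotide ambiguity codes and the bases each one allows
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def matches_consensus(consensus, candidate):
    """True if every candidate base is allowed by the consensus code."""
    if len(consensus) != len(candidate):
        return False
    return all(base in IUPAC[code] for code, base in zip(consensus, candidate))

def join(blocks):
    """Remove the spaces used to group bases in blocks of ten."""
    return "".join(blocks.split())

seq5_start = join("GTTCARRGAT AGCCCCCATC TATTTGGCCW RGYATTAGCC")
seq1_excerpt = join("GTTCAGGGAT AGCCCCCATC TATTTGGCCA GGCATTAGCC")
seq4_start = join("GTTCAGGGAT AGCCCCCATC TATTTGGCCA GGCACTAGCT")

seq1_fits = matches_consensus(seq5_start, seq1_excerpt)
seq4_fits = matches_consensus(seq5_start, seq4_start)

print(seq1_fits, seq4_fits)
```

Over these 40 positions the SEQ ID NO: 1 excerpt is compatible with SEQ ID NO: 5, whereas SEQ ID NO: 4 is not.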



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 645 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TCAGGGATAG CCCCCATCTA TTTGGCCAGG CATTAGCCCA AGACTTGAGT CAATTCTCAT 60
ACCTGGACAC TCTTGTCCTT CAGTACATGG ATGATTTACT TTTAGTCGCC CGTTCAGAAA 120
CCTTGTGCCA TCAAGCCACC CAAGAACTCT TAACTTTCCT CACTACCTGT GGCTACAAGG 180
TTTCCAAACC AAAGGCTCGG CTCTGCTCAC AGGAGATTAG ATACTNAGGG CTAAAATTAT 240
CCAAAGGCAC CAGGGCCCTC AGTGAGGAAC GTATCCAGCC TATACTGGCT TATCCTCATC 300
CCAAAACCCT AAAGCAACTA AGAGGGTTCC TTGGCATAAC AGGTTTCTGC CGAAAACAGA 360
TTCCCAGGTA CASCCCAATA GCCAGACCAT TATATACACT AATTANGGAA ACTCAGAAAG 420
CCAATACCTA TTTAGTAAGA TGGACACCTA CAGAAGTGGC TTTCCAGGCC CTAAAGAAGG 480
CCCTAACCCA AGCCCCAGTG TTCAGCTTGC CAACAGGGCA AGATTTTTCT TTATATGCCA 540
CAGAAAAAAC AGGAATAGCT CTAGGAGTCC TTACGCAGGT CTCAGGGATG AGCTTGCAAC 600
CCGTGGTATA CCTGAGTAAG GAAATTGATG TAGTGGCAAA GGGTT 645



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 741 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CAAGCCACCC AAGAACTCTT AAATTTCCTC ACTACCTGTG GCTACAAGGT TTCCAAACCA 60
AAGGCTCAGC TCTGCTCACA GGAGATTAGA TACTTAGGGT TAAAATTATC CAAAGGCACC 120
AGGGGCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT ATCCTCATCC CAAAACCCTA 180
AAGCAACTAA GAGGGTTCCT TAGCATGATC AGGTTTCTGC CGAAAACAAG ATTCCCAGGT 240
ACAACCAAAA TAGCCAGACC ATTATATACA CTAATTAAGG AAACTCAGAA AGCCAATACC 300
TATTTAGTAA GATGGACACC TAAACAGAAG GCTTTCCAGG CCCTAAAGAA GGCCCTAACC 360
CAAGCCCCAG TGTTCAGCTT GCCAACAGGG CAAGATTTTT CTTTATATGG CACAGAAAAA 420
ACAGGAATCG CTCTAGGAGT CCTTACACAG GTCCGAGGGA TGAGCTTGCA ACCCGTGGCA 480
TACCTGAATA AGGAAATTGA TGTAGTGGCA AAGGGTTGGC CTCATNGTTT ATGGGTAATG 540
GNGGCAGTAG CAGTCTNAGT ATCTGAAGCA GTTAAAATAA TACAGGGAAG AGATCTTNCT 600
GTGTGGACAT CTCATGATGT GAACGGCATA CTCACTGCTA AAGGAGACTT GTGGTTGTCA 660
GACAACCATT TACTTAANTA TCAGGCTCTA TTACTTGAAG AGCCAGTGCT GNGACTGCGC 720
ACTTGTGCAA CTCTTAAACC C 741



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TGGAAAGTGT TGCCACAGGG CGCTGAAGCC TATCGCGTGC AGTTGCCGGA TGCCGCCTAT 60
AGCCTCTACA TGGATGACAT CCTGCTGGCC TCC 93

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 96 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TTGGATCCAG TGYTGCCACA GGGCGCTGAA GCCTATCGCG TGCAGTTGCC GGATGCCGCC 60
TATAGCCTCT ACGTGGATGA CCTSCTGAAG CTTGAG 96

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 748 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TGCAAGCTTC ACCGCTTGCT GGATGTAGGC CTCAGTACCG GNGTGCCCCG CGCGCTGTAG 60
TTCGATGTAG AAAGCGCCCG GAAACACGCG GGACCAATGC GTCGCCAGCT TGCGCGCCAG 120
CGCCTCGTTG CCATTGGCCA GCGCCACGCC GATATCACCC GCCATGGCGC CGGAGAGCGC 180
CAGCAGACCG GCGGCCAGCG GCGCATTCTC AACGCCGGGC TCGTCGAACC ATTCGGGGGC 240
GATTTCCGCA CGACCGCGAT GCTGGTTGGA GAGCCAGGCC CTGGCCAGCA ACTGGCACAG 300
GTTCAGGTAA CCCTGCTTGT CCCGCACCAA CAGCAGCAGG CGGGTCGGCT TGTCGCGCTC 360
GTCGTGATTG GTGATCCACA CGTCAGCCCC GACGATGGGC TTCACGCCCT TGCCACGCGC 420
TTCCTTGTAG ANGCGCACCA GCCCGAAGGC ATTGGCGAGA TCGGTCAGCG CCAAGGCGCC 480
CATGCCATCT TTGGCGGCAG CCTTGACGGC ATCGTCGAGA CGGACATTGC CATCGACGAC 540
GGAATATTCG GAGTGGAGAC GGAGGTGGAC GAAGCGCGGC GAATTCATCC GCGTATTGTA 600
ACGGGTGACA CCTTCCGCAA AGCATTCCGG ACGTGCCCGA TTGACCCGGA GCAACCCCGC 660
ACGGCTGCGC GGGCAGTTAT AATTTCGGCT TACGAATCAA CGGGTTACCC CAGGGCGCTG 720
AAGCCTATCG CGTGCAGTTG CCGGATGC 748

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GCATCCGGCA ACTGCACG 18

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GTAGTTCGAT GTAGAAAGCG 20

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GCATCCGGCA ACTGCACG 18

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

AGGAGTAAGG AAACCCAACG GAC 23

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TAAGAGTTGC ACAAGTGCG 19

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
TCAGGGATAG CCCCCATCTA T 21 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

AACCCTTTGC CACTACATCA ATTT 24



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURES:
(B) LOCATION: 5, 7, 10, 13
(D) OTHER INFORMATION: G represents inosine (i)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
GGTCGTGCCG CAGGG 15



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TTAGGGATAG CCCTCATCTC T 21

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
TCAGGGATAG CCCCCATCTA T 21

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
AACCCTTTGC CACTACATCA ATTT 24

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

GCGTAAGGAC TCCTAGAGCT ATT 23

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 



TCATCCATGT ACCGAAGG 18 



(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ATGGGGTTCC CAAGTTCCCT 20

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

GCCGATATCA CCCGCCATGG 20 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

GCATCCGGCA ACTGCACG 18

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

CGCGATGCTG GTTGGAGAGC 20

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
TCTCCACTCC GAATATTCCG 20

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GATCTAGGCC ACTTCTCAGG TCCAGS 26

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURES:
(B) LOCATION: 6, 12, 19
(D) OTHER INFORMATION: G represents inosine (i)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
CATCTGTTTG GGCAGGCAGT AGC 23

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
CTTGAGCCAG TTCTCATACC TGGA 24

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

AGTGYTRCCM CARGGCGCTG AA 22 
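
SEQ ID NO: 34 contains the ambiguity symbols Y, R and M, so it stands for a pool of oligonucleotides rather than a single one. The following is a minimal Python sketch, assuming the standard IUPAC nucleotide code; the names `IUPAC` and `expand_degenerate` are illustrative only and are not part of the listing.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (assumed; not stated in the listing).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(oligo):
    """Return every unambiguous sequence covered by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo]
    return ["".join(combo) for combo in product(*choices)]

# SEQ ID NO: 34 as listed, with the block spacing removed.
seq_id_34 = "AGTGYTRCCM CARGGCGCTG AA".replace(" ", "")
variants = expand_degenerate(seq_id_34)
print(len(variants))
```

Each of the four ambiguous positions (Y, R, M, R) admits two bases, so the pool has 2 x 2 x 2 x 2 = 16 members.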



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

GMGGCCAGCA GSAKGTCATC CA 22 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

GGATGCCGCC TATAGCCTCT AC 22

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
AAGCCTATCG CGTGCAGTTG CC 22

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TAAAGATCTA GAATTCGGCT ATAGGCGGCA TCCGGCAAGT 40

(2) INFORMATION FOR SEQ ID NO: 39 

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 50 amino acids
(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 39

Asp Ala Phe Phe Cys Ile Pro Val Arg Pro Asp Ser Gln Phe Leu Phe
1               5                   10                  15

Ala Phe Glu Asp Pro Leu Asn Pro Thr Ser Gln Leu Thr Trp Thr Val
            20                  25                  30

Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly Gln Ala Leu
        35                  40                  45

Ala Gln
    50

(2) INFORMATION FOR SEQ ID NO: 40 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 150 base pairs 

(B) TYPE : nucleic acid
(C) STRANDEDNESS : single
(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 40

GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 60
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC 120
CCCCATCTAT TTGGCCAGGC ATTAGCCCAA 150
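
Read in its first reading frame, the 150-base sequence of SEQ ID NO: 40 translates to the 50-residue peptide listed as SEQ ID NO: 39. The following is a minimal Python sketch of that check, assuming the standard genetic code; the names `CODON_TABLE`, `SEQ_ID_40` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the listing itself does not state which code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID NO: 40 as listed, with the block spacing removed.
SEQ_ID_40 = (
    "GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT"
    "CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC"
    "CCCCATCTAT TTGGCCAGGC ATTAGCCCAA"
).replace(" ", "")

def translate(dna):
    """Translate a DNA string codon by codon, starting at its first base."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

peptide = translate(SEQ_ID_40)
print(peptide)
```

The one-letter output can be compared residue by residue with the three-letter listing of SEQ ID NO: 39 (Asp = D, Ala = A, Phe = F, and so on).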

(2) INFORMATION FOR SEQ ID NO: 41 



(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 11 amino acids
(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 41

Cys Ile Pro Val Arg Pro Asp Ser Gln Phe Leu
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 42

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 17 amino acids 

(B) TYPE : amino acid 

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 42

Val Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly Glu Ala
1               5                   10                  15

Leu
17

(2) INFORMATION FOR SEQ ID NO: 43

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 8 amino acids
(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 43 

Leu Phe Ala Phe Glu Asp Pro Leu
1               5           8

(2) INFORMATION FOR SEQ ID NO: 44

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 8 amino acids
(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 44

Phe Ala Phe Glu Asp Pro Leu Asn
1               5           8

(2) INFORMATION FOR SEQ ID NO: 45 

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 25 base pairs
(B) TYPE : nucleic acid
(C) STRANDEDNESS : single
(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 45

GTGCTGATTG GTGTATTTAC AATCC

(2) INFORMATION FOR SEQ ID NO: 46 



(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 1859 base pairs
(B) TYPE : nucleic acid
(C) STRANDEDNESS : single
(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 46

GTGCTGATTG GTGTATTTAC AATCCTTTAT CTAATCCGAA ATGCCCATGT TGCAATATGG 60
AAAGAAAGGG AGTTCCTAAC CTCTGGGGGA ACCCCCATTA AATACCACAA GTAAATCATG 120
GAGTTATTGC ACACAGTGCA AAAACTCAAG GAGGTGGAAG TCTTACACTG CCAAAGCCAT 180
CAGAAAAGGG AAGAGGGGAG AAGAGCAGCA TAAGTGGCTA CAGAGGCAAG GAAAGACTAG 240
CAGAAAGGAA AGAGAGAAAG AGACAGAAAG TCAGAGAGAG AGAGAGGAAG AGACAGAGCA 300
CAAAGAGGGA GTCAGAGAGA GAGAGAGACA GAGAGTCAGA GAGAAGGAAA GAGAGAGAGG 360
AAGAGACAAA GAATGAATCA AACAGAGAGA CAGAAAGTCA GAGAGAGAGA GAGAGAGGAA 420
GAGACAGAGA AAAAGAGGGA GTCAGAAAAA GAGAGACCAA AGAAGAAGTC CAAAGAGAAA 480
GAAAGAGAGA TGGAAGTAGT AAAGGAAAAA CAGTGTACCC TATTCCTTTA AAAGCCGGGG 540
TAAATTTAAA ACCTATAATT GATAACTGAA GGTCTTCTCT GTAACCCTGT AACACTCCAA 600
TACCACCTTG TTGTCAAGTG TAAACAAGGG CGTAGCCCAA AAGCACTGAG GCCACTAACA 660
ACCCATAGCC TTCCTATCAA AATTCCTTAA CCCAGCAGGT TTCCTAACAG GGGATCTAAA 720
TCTTAATTAA TTACCATACA ATGGTCCAAC CAGACTTAGG AGGAATTCCC TTCAGGACGG 780
GAAGATAGAT GCTTCCTCCC AGGCGATTAA GGGAGAAAGA CACAATGGGT ATTCAGTAAG 840
TGCCAAGGGG AACACTTGTA GAAGCAAAGT TAGGAAAATT GCCAAATAAT TGGTTTGCTC 900
AAGAGTTGTT TGCACTCAGC CAAACCTTGA AGTACTTGCA GAATCAGAAA GGAGCCATCT 960
ATACCAATTC TAAGTTAATA TGGACTGAAG GAGGTTTTAT TAATACCAAA GAGAAATTAA 1020
AATCCCAAAC TTATAAGGTT TTCAACCAAA GTAAAGTTTG CTAAAAGTTA ACAGCGTAAC 1080
ATGTATTATC CTACTACCAC ACACTCTCAA AGGATTTCTC AGACAGTTTG CAAGAAATAA 1140
TGATATCTAT CCTTACTCTA CAATCCCAAA TAGACTCTTT GGCAGCAGTG ACTCTCCAAA 1200
ACCGTCAAGG CCTAGACCTC CTCACTGCTG AGAAAGGAGG ACTCTGCACC TTCTTAAGGG 1260
AAGAGTGTTG TCTTTACACT AACCAGTCAG GGATAGTATG AGATGCTGCC CGGCATTTAC 1320
AGAAAAAGGC TTCTGAAATC AGACAACGCC TTTCAAATTC CTATACCAAC CTCTGGAGTT 1380
GGGCAACATG GTTTCTTCCC TTTCTATGTC CCATGGCTGC CATCTTGCTA TTACTCGCCT 1440
TTGGGCCCTG TATTTTTAAC CTCCTTGTCA AATTTGTTTC TTCTAGGATC GAGGCCATCA 1500
AGCTACAGAT GGTCTTACAA ATGGAACCCC AAATGAGCTC AACTATCAAC TTCTACTGAG 1560
GACCCCTAGA CCAACCCCCT GGCCCTTTCA CTGGCCTAAA GAGTTCCCCT CTGGAGGACA 1620
CTACCACTGC AGGGCCCCAT CTTTGCCCCT ATCCAGAAGG AAGTAGCTAG AGCAGTCATT 1680
GCCCAATTCC CAAGAGCAGC TGGGGTGTCC CGTTTAGAGT GGGGATTGAG AGGTGAAGCC 1740
AGCTGGACTT CTGGGTCGGG TGGGGACTTG GAGAACTTTT GTGTCTAGCT AAAGGATTGT 1800
AAATGCAACA ATCAGTGCTC TGTGTCTAGC TAAAGGATTG TAAATACACC AATCAGCAC 1859



(2) INFORMATION FOR SEQ ID NO: 47 

(i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH : 23 base pairs
(B) TYPE : nucleic acid
(C) STRANDEDNESS : single
(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 47
TGATGTGAAC GGCATACTCA CTG 23

(2) INFORMATION FOR SEQ ID NO: 48 

(i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH : 24 base pairs
(B) TYPE : nucleic acid
(C) STRANDEDNESS : single
(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 48

CCCAGAGGTT AGGAACTCCC TTTC 24

(2) INFORMATION FOR SEQ ID NO: 49

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 25 base pairs
(B) TYPE : nucleic acid
(C) STRANDEDNESS : single
(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 49

GCTAAAGGAG ACTTGTGGTT GTCAG 25



(2) INFORMATION FOR SEQ ID NO: 50: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 



CAACATGGGC ATTTCGGATT AG 22 



(2) INFORMATION FOR SEQ ID NO: 51: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 400 base pairs 
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

GGCTGCTAAA GGAGACTTGT GGTTGTCAGA CAATCGCCTA CTTAGGTACC AGGCCTTATT 60 

ACTTGAGGGA CTGGTGCTTC AGATGCGCAC TTGTGCAGCT CTTAACCCAA ACTTATGCTG 120 

CCCAGAAGGA TCTTTTAGAG GTCCCCTTAG CCAACCCTGA CCTCAACCTA TATATATACT 180 

GATGGAAGTT CGTTTGTAGA AAAGGGATTA CAAAGGGNAG GATATNCCAT AGGTTAGTGA 240

TAAAGCAGTA CTTGAAAGTA AGCCTCTTCC CCCCAGGGAC CAGCGCCCCC GTTAGCAGAA 300 

CTAGTGGCAC TGACCCCGAG CCTTAGAACT TGGAAAGGGA GGAGGATAAA TGTGTATACA 360 

GATAGCAAGT ATGCTTATCT AATCCGAAAT GCCCATGTTG 400 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2389 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

TCAGGGATAG CCCCCATCTA TTTGGTCAGG CACTGGCCCA AGATCTAGGG ACATGCCACT 60 

TTTAAGAGCC ATTTCTCAAG TCCAGGTACT CTGGTCCTTC GGTATGTGGA TGATTTACTT 120 

TTGGCTACCA GTTCAGTAGC CTCATGCCAG CAGGCTACTC TAGATCTCTT GAACTTTCTA 180 

GCTAATCAAG GGTACAAGGC ATCTAGGTTG AAGGCCCAGC TTTGCCTACA GCAGGTCAAA 240 

TATCTAGGCC TAATCTTAGC CAGAGGGACC AGGGCACTCA GCAAGGAACA AATACAGCCT 300 

ATACTGGCTT ATCCTCACCC TAAGACATTA AAACAGTTGC GGGGGTTCCT TGGAATCACT 360 

GGCTTTTTGG TGACTATGGA TTCCCAGATA CAGCAAGATT GGCAGGCCCC TCTATACTGT 420 

AATCAAGGAG ACTCACGAGG GCAAGTACTC ATCTAGTAGA ATGGGAACTA GGGACAGAAA 480 

CAGCCTTCAA AACCTTAAAG CAGGCCCTAG TACAATCTCC AGCTTTAAGC CTTCCCACAG 540

GACAAAACTT CTCTTTATAC ATCACAGAGA GGGCAGAGAT AGCTCTTGGT GTCCTTATTC 600

AGACTCATGG GACTACCCCA CAACCAGTGG CACACCTAAG TAAGGAAATT GATGTAGTAG 660

CAAAAGGCTG GCCTCACTGT TTATGGGTAG CTGTGGTGGT GGCTGTCTTA GTGTCAGAAG 720

CTATCAAAAT AATACAAGGA AAGGATCTCA CTGTCTGGAC TACTCATGAT GTAATGGCAT 780

ACTAGGTGCC AAAAGAAGTT TATGGGTATC AGACAACCAC CTGCTTAGAT ACCAGGGACT 840

ACTCCTGGAG GATTGGGCTT CAAGTGCGTT TTTTGTGGCC TCAACCCTGC CACTTTTCCT 900

CCAGAGGATG GAGAGCCGCT TGAGCATGCT TGCCAACAGG TTGTAGGCCA GAATTATTCC 960

ACCCGAGATG ATCTCTTAGA GTACCCTTAG CTAATCCTGA CCTTAACCTA TATACCAATG 1020

GAAGTTCATT TGTGGAAAAC GGGATATGAA GGGCAGGTTA TGTCATAGTT AGTGATGTAA 1080

TCATACTTGC AAGTAAGCCT CTTACCCCAG GGGCCAGCAC TCAGTTAGCA GAACTAGTCA 1140

CACTTACCTT AACCTTAGAA CTGGGAAAGG GAAAAAGAAT AAATATGTAT ACAGATAGTA 1200

AGTATGCTTA TCTAATCCTA CATGCCCATG CTGCAATATG GAAGGAAAGG GAGTTCCTAA 1260

CCCCTGGGGG AACCCCCATT AAATACCACA AGGYAAATCA TGGAGTTATT GCACGCAGTG 1320

CAAAAACTCA AGGAGGTGGC AGTCTTACAC TGCCGAAGCY ATCAAAAAGG GGAAGGAGAG 1380

GGGAGAACAG CAGCATAAGT GGTTGGCAGA GGCAGTGAAA GACCAGCAGA GAGAAGGAGA 1440

GAGACAACGT CAACGACAGA AGGAAAGAAG AGGAGGAGAC AGAGAGGAAG AGACAGAGAG 1500

ACAGTTAGTC CAAGAGAGAG ACAGAGAGAG GAAGAGACAG ACAGAAAGTC CAAGAGAGAA 1560

GGAAAGAGAG GAAGAGACCA AGGAGTCCNA GAGAGAGAAA GAGATAGAAG TAGTAAAGAA 1620

AAAACATTGT ACCCTATTCC TTTAAAAGCC GGGGTATATT TAAAACCTAT AATTGATAAT 1680

TGAGTTCTTG CACCCTCCTC CAGGGGATYG CTGGGAGGAA ACCCTCAACC GATATGTGAA 1740

AATTGTGGGT CGTCCCTATG TCTCAATTAC CAGCCAATAC CCCCTTGTTT TTAGTGTGAA 1800

CGAGGGTGTA GAGCGCAGAC AGGGAGACCT CTGACAATCC ATACCCTTCC TATCCAAAAT 1860

CCTTAACCCA GCAGGTTTTC TAAAAGGGGA TCTAAATCTT AATTAATTAC CATACAAAGG 1920

TCAAACCAGA TCTAGGAGGA ACTTCCTTCA GGACAGGATG ATAGATGGTT CCTCCCAGGC 1980

GATTAAAGAA AATAAAAAGA CACATGGGCA GCCAGTAAGT GATAAGGGAA CACTAGTAGA 2040

AGCAGTTAGG AGAAGTTGCC TAATAATTGG TCTACTCCAA ATGTGTGAGT TGTTCGCACT 2100

CAGCCCAAAT CTTAAAGTAC TTACAGAATT AGGGAGGAGC CATTTACACC AATTCTAAGT 2160

TAATATGGAC TGGATGAGGT TTTATTAATA GCGAAGGAGA ATTAAATCCT AAACTNACAA 2220

GGTTTTCAAC TAAAGTAAAT TTTACTAAAA GCTAACAGTG TAACATGCAT TATCCTACTA 2280

CAACACACTC TCANAGGATT CCTCAGACAG TTTACAAGAA ATAACAAAAT CTATCTGGTA 2340

AGGATAGTAA CTACAATCCC AAATACATTC TTTGGCAGCA GTGACTCTC 2389



(2) INFORMATION FOR SEQ ID NO: 53: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2448 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

TCAGGGATAG CCCCCATCTA TTTGATCAGG CACTAGCCCA AGATCTAGGC CACTTCTGAA 60
GTCCAGGCAT TCTAGTCCTT CAGTATGTGG ATGATTTACT TTTGGCTACC AGTTTGGAAG 120
CCTCATGCCA GCAGGCTACT TGAGATCTCT TGAACTTTCT AGCTAATCAA GGGTGTATGG 180
CATCTAAATT GAAAGTCCAG CTCTGCCTAC AACAAGTCAA ATATCTAGGC CTAATCTTAG 240
ATAGAAGAAC CAGGGCCCTC AGCAAGGAAT GAATAAAGCC TATGCTGGCT TATCGGCACC 300
CTAAGACATT AAAACAATTG TGGGGGTTCC TTGGAATCAC TGGCTTTTGC CGACTATGGA 360
TCCCTGGATA GAGTGAGATA GCCAGGCCCC CTCTATTACT CTTATCAAGG AGACCCAGAG 420
GGCAAATACT TATCTAGTAT TATGGGNACC AGAGGCAGAA AAAGCCTTCC AAACCTTAAA 480
GGAGACCCTA GTACAAGCTC CAGCTTTAAG CCTTCCCACA GGACAAANCT TCTCTTTATA 540
TGTCACAGAG AGAGCAGGAA TAGCTCCTGG AGTCCTTACT CAGACTTTTG GACGACCCCA 600
CGGCCAGTGG CRTACCTAAG TAAGGAAATT GATGTAGTAG CAAAAGGCTG GCCTCACTGT 660
TTATGGGTAG TTGCGGCTGT GGCAGTCTTA CTGTCAAAGG CTATCAAAAT AATACAAGGA 720
AAGGATTTCA CTATCTGGAC TACTCATGAG GAAAATGGCA TATTAGGTGC CAAAGGAAGT 780
TTTTGGCTAT CAGACAACCA CCTGCTCAGA TTCCAGGCAC TACTGATTGA GAGACCAGTG 840
CTTTAAATAT GTATGTGTGT GTGTGGCCCT CAACCCTGCC ACTGTTCTCC CAGAAGATGG 900
AGAACCAATG AAGCATTACT GTCAACAAAT TAGAGTCCAG AGTTATGCTG CCTGAGAGGA 960
TCTCTTAGAA GTCCCCTTAG CTAATCCTGA CCTTAACCTA TATGCTGATG GAAGTTCACT 1020
TGTGGAGAAT GGGATACGAA AAGCACATTA TGCCATAGTT AGTGAGGTAA CAGTACTTGA 1080
AAGTAAGCCT ATTCCCCCAT GGACCAGAGC CCAGTTAGCA GAACTAGTGG CACTTACCCA 1140
AGCCTTAGAA CTAGGAAAGG GAAAAATAAT AAATGTGTAT ACAGATAGCA AGTATGCTTA 1200
TCTAATCCTA CATGCCCATG CTGCAGTATG GAAAGAAAGG GAGTTCCTAA CCTCTGGGGG 1260
AACCCCCATT AAATACCACA AGGCAAATCA TGGAGTTATT GCATGTAGTG CAAAACCTCA 1320
AGTAGGTGGC AGTTTTACAC TGCCTGAAGC TATGGGGAAG GAGAGAGGAG AACAGCAGCA 1380
TAAGTGGCTA GCAGAGGCAG CGAAAGACTA GCAGAGAGGA GAGGTAGGGG AAAGACAGAA 1440
AGTCAAAGAA AAGAAGTCAA AGACAGACAG AGAAAGAGAC AGAGGGAGCC AGAGAGAAAG 1500
AAAAGAGAGA ACGAAAGAGA CAGAATGTCA AAGAACAGAA GAGAGAGGCA GCGCCAGAAG 1560
AGTTAAGAAA GTGAGAAAGA GAGATGGAAA TAGTAAAGAA AAAACAGTGT ACCCTATTCC 1620
TTTAAAAGCC AGGGTAAATT TAAAACGTAT AATTTTATAA TTGGAAGGTC TTCTCCATAA 1680
CCCTATAACA TTAAAATACC ACCTTGTTGT CAGTGTAAAC AAGAGCATAG CCCAAAAGCA 1740
CTGAGGCCAC TGACAACCCA TAGCCTTCCT ATCAAAAATC CTTAACTCTG CAGGTTTCCT 1800
AACAGGGGAT CTAAATCTCA ACTAATCACC ATACAATGGT CCGACCAGAC CTAGGAGCGA 1860
CTCCCCTCAG GACAGAAGGA TGGATGGTTC CTCCCAGGCC ATTAAGGGAA AGAGACACAA 1920
TGGGTATTCA GTAAGTGATA AGGGAACTCT TGTAGAAGCA GTTAGGAAGA TTGCCTAATA 1980
TTTGGTCTGC TCAAATGTGC CAGCTGTTTG CACTCAGCTA AACCTTAAAT TACTTACAGA 2040
ATTAGGAAGG AGCCATCTAT ACCAATTCTG AGTTAATATG AGCTGAACAA GTTCTTATTA 2100
ATAGCAAAGA ATCATTGAAA TCTCAAACTT GCAAAGTTTT CAACAAAAGT AAAGTTTGCT 2160
GAAAGTTAGC AGTGTAACAT GTATTATCCT AACTTCTAAT CTTGTGGAAA TCAGACCCTA 2220
TCAGTGCCCC TCAAAGCTGA AGTCCATCAG CATATGGCCA TACAACTAAT ACCCCTATTT 2280
ATAGGGTTAG GAATGGCCAC TGCTACAGGA ATGGGAGTAA CAGGTTTATC TACTTCATTA 2340
TCCTATTACC ACACACTCTT AAAGGATTTC TCAGACAGTT TACAAGAAAT AACAAAATCT 2400
ATCCTTACTC TNTARTCCCA AATAGRTTCT TTGGCAGCAG TGACTCTC 2448

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
CCTGAGTTCT TGCACTAACC C 21

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
GTCCGTTGGG TTTCCTTACT CCT 23

(2) INFORMATION FOR SEQ ID NO: 56: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1196 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

TTCCTGAGTT CTTGCACTAA CCTCAAATGA GAGAAGTGCC GCCATAACTG CAACCCAAGA 60
GTTTGGCGAT CCCTGGTATC TCAGTCAGGT CAATGACAGG ATGACAACAG AGGAAAGATA 120
ATGATTCCCC ACAGGCCAGC AGGCAGTTCC CAGTGTAGAC CCTCATTAGG ACACAGAATC 180
AGAACATGGA GATTGGTGCC GCAGACATTT GCTAACTTGC GTGCTAGAAG GACTAAGGAA 240
AACTAGGAAG ATATGAATTA TTCAATGATG TCCACTATAA CACAGGGGAA AGGAAGAAAA 300
TCCTACTGCC TTTCTGGAGA GACTAAGGGA GGCATTGAGG AAGCATACCA GGCAAGTGGA 360
CATTGGAGGC TCTGGAAAAG GGAAAAGTTG GGAAAAGTAT ATGTCTAATA GGGCTTGCTT 420
CCAGTGTGGT CTACAAGGAC ACTTTAAAAA AGATTGTCCA ATAGAAATAA GCCACCACCT 480
CGTCCATGCC CCTTATGTCA AGGGAATCAC TGGAAGGCCC ACTGCCCCAG GGGATGAAGG 540
TCCTCTGAGT CAGAAGCCAC TAACCAGATG ATCCAGCAGC AGGACTGAGG GTGCCCGGGG 600
CAAGCGCCAG CCCATGCCAT CACCCTCACA GAGCCCCAGG TATGCTTGAC CATTGAGGGT 660
CAGAAGGGTA CTGTCTCCTG GACACTGGCG GGCCTTCTCA GTCTTACTTT CCTGTCCTGG 720
ACAACTGTCC TCCAGATCTG TCACTGTCCG AGGGGTCCTA GGACAGCCAG TCACTAGATA 780
CTTCTCCCAG CCACTAAGTT GTGACTGGGG AACTTTACTC TTCCACATGC TTTTCTAATT 840



WO 98/23755

ATGCCTGAAA GCCCCACTCT CTTGTTAGGG GAGAGACATT CTAGCAAAAG CAGGGGCCAT 900
TATACATGTG AATATAGGAG AAGGAACAAC TGTTTGTTGT CCCCTGCTTG AGGAAGGAAT 960
TAATCCTGAA GTCCGGGCAA CAGAAGGACA ATATGGACAA GCAAAGAATG CCCGTCCTGT 1020
TCAAGTTAAA CTAAAGGATT CCACCTCCTT TCCCTACCAA AGGCAGTACC CCCTCAGACC 1080
CGAGACCCAA CAAGAACTCC AAAAGATTGT AAAGGACCTA AAAGCCCAAG GCCTAGTAAA 1140
ACCAAGCAAT AGCCCTTGCA AGACTCCAAT TTTAGGAGTA AGGAAACCCA ACGGAC 1196



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2391 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

ATGATCCAGC AGCAGGACNG AGGGTGCCCG GGGCAAGCGC CAGCCCATGC CATCACCCTC 60
ACAGAGCCCC AGGTATGCTT GACCATTGAG GGTCAGAAGG GTNACTGTCT CCTGGACACT 120
GGCGGNGCCT TCTCAGTCTT ACTTTCCTGT CCTGGACAAC TGTCCTCCAG ATCTGTCACT 180
GTCCGAGGGG TCCTAGGACA GCCAGTCACT AGATACTTCT CCCAGCCACT AAGTTGTGAC 240
TGGGGAACTT TACTCTTCCC ACATGCTTTT CTAATTATGC CTGAAAGCCC CACTCTCTTG 300
TTGGGGAGAG ACATTCTAGC AAAAGCAGGG GCCATTATAC ATGTGAATAT AGGAGAAGGA 360
ACAACTGTTT GTTGTCCCCT GCTTGAGGAA GGAATTAATC CTGAAGTCCG GGCAACAGAA 420
GGACAATATG GACAAGCAAA GAATGCCCGT CCTGTTCAAG TTAAACTAAA GGATTCCACC 480
TCCTTTCCCT ACCAAAGGCA GTACCCCCTC AGACCCGAGA CCCAACAAGA ACTCCAAAAG 540
ATTGTAAAGG ACCTAAAAGC CCAAGGCCTA GTAAAACCAA GCAATAGCCC TTGCAAGACT 600
CCAATTTTAG GAGTAAGGAA ACCCAACGGA CAGTGGAGGT TAGTGCAAGA ACTCAGGATT 660
ATCAATGAGG CTGTTGTTCC TCTATACCCA GCTGTACCTA ACCCTTATAC AGTGCTTTCC 720
CAAATACCAG AGGAAGCAGA GTGGTTTACA GTCCTGGACC TTAAGGATGC CTTTTTCTGC 780
ATCCCTGTAC GTCCTGACTC TCAATTCTTG TTTGCCTTTG AAGATCCTTT GAACCCAACG 840
TCTCAACTCA CCTGGACTGT TTTACCCCAA GGGTTCAGGG ATAGCCCCCA TCTATTTGGC 900
CAGGCATTAG CCCAAGACTT GAGTCAATTC TCATACCTGG ACACTCTTGT CCTTCAGTAC 960
ATGGATGATT TACTTTTAGT CGCCCGTTCA GAAACCTTGT GCCATCAAGC CACCCAAGAA 1020
CTCTTAACTT TCCTCACTAC CTGTGGCTAC AAGGTTTCCA AACCAAAGGC TCGGCTCTGC 1080
TCACAGGAGA TTAGATACTN AGGGCTAAAA TTATCCAAAG GCACCAGGGC CCTCAGTGAG 1140
GAACGTATCC AGCCTATACT GGCTTATCCT CATCCCAAAA CCCTAAAGCA ACTAAGAGGG 1200
TTCCTTGGCA TAACAGGTTT CTGCCGAAAA CAGATTCCCA GGTACASCCC AATAGCCAGA 1260
CCATTATATA CACTAATTAN GGAAACTCAG AAAGCCAATA CCTATTTAGT AAGATGGACA 1320
CCTACAGAAG TGGCTTTCCA GGCCCTAAAG AAGGCCCTAA CCCAAGCCCC AGTGTTCAGC 1380
TTGCCAACAG GGCAAGATTT TTCTTTATAT GCCACAGAAA AAACAGGAAT AGCTCTAGGA 1440
GTCCTTACGC AGGTCTCAGG GATGAGCTTG CAACCCGTGG TATACCTGAG TAAGGAAATT 1500
GATGTAGTGG CAAAGGGTTG GCCTCATNGT TTATGGGTAA TGGNGGCAGT AGCAGTCTNA 1560
GTATCTGAAG CAGTTAAAAT AATACAGGGA AGAGATCTTN CTGTGTGGAC ATCTCATGAT 1620
GTGAACGGCA TACTCACTGC TAAAGGAGAC TTGTGGTTGT CAGACAACCA TTTACTTAAN 1680
TATCAGGCTC TATTACTTGA AGAGCCAGTG CTGNGACTGC GCACTTGTGC AACTCTTAAA 1740
CCCAAACTTA TGCTGCCCAG AAGGATCTTT NTAGAGGTCC CCTTAGCCAA CCCTGACCTC 1800
AACTATATAT ATACTGATGG AAGTTCGTTT GTAGAAAAGG GATTACAAAG GGNAGGATAT 1860
NCCATAGGTG TTAGTGATAA AGCAGTACTT GAAAGTAAGC CTCTTCCCCC CCAGGGACCA 1920
GCGCCCCCGT TAGCAGAACT AGTGGCACTG ACCCCGCGAG CCTTAGAACT TTGGAAAGGG 1980
AGGAGGATAA ATGTGTATAC AGATAGCAAG TATGCTTATC TAATCCGAAA TGCCCATGTT 2040
GTTTATCTAA TCCGAAATGC CCATGTTGCA ATATGGAAAG AAAGGGAGTT CCTAACCTCT 2100
GGGGGAACCC CCATTAAATA CCACAAGTTA ATCATGGAGT TATTGCACAC AGTGCAAAAA 2160
CTCAAGGAGG TGGAAGTCTT ACACTGCCAA AGCCATCAGA AAAGGGAAAG GGGAGAAGAG 2220
CAGCATAAGT GGCTACAGAG GCAAGGAAAG ACTAGCAGAA AGGAAAGAGA GAAAGAGACA 2280
GAAAGTCAGA GAGAGAGAGA GGAAGAGACA GAGCACAAAG AGGGAGTCAG AGAGAGAGAG 2340
AGACAGAGAG TCAGAGAGAA GGAAAGAGAG AGAGGAAGAG ACAAAGAATG A 2391



(2) INFORMATION FOR SEQ ID NO: 58: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1722 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

TGGAGAATAG CAGCATAAGT TGGCTGGCAG AAGTAGGGAA AGACAGCAAG AAGTAAAGAA 60
AAAAARGAGA AAGTCAGAGA AAGAAAAAAA GAGAGGAAGA AACAAAGAAG AACTTGAAGA 120
GAGAAAGAAG TAGTAAAGAA AAAACAGTAT ACCCTATTCC TTTAAAAGCC AGGGTAAATT 180
TCTGTCTACC TAGCCAAGGC ATATTCTTCT TATGTGGAAC ATCAACCTAT ATCTGCCTCC 240
CCACTAACTG GACAGGCACC TGAACCTTAG TCTTTCTAAG TCCCAACATT AACATTGCCC 300
CAGGAAATCA GACCCTATTG GTACCTGTCA AAGCTAAAGT CCCGTCAGTG CAGAGCCATA 360
CAACTAATAT CCCTATTTAT AGGGTTAGGA ATGGCTACTG CTACAGGAAC TGGAATAGCC 420
GGTTTATCTA CTTCATTATC CTACTACCAT ACACTCTCAA AGAATTTCTC AGACAGTTTG 480
CAAGAAATAA TGAAATCTAT TCTTACTTTA CAATCCCAAT TAGACTCTTT GGCAGCAATG 540
ACTCTCCAAA ACCGCCGAGG CCCACACCTC CTCACTGCTG AGAAAGGAGG ACTCTGCACC 600
TTCTTAGGGG AAGAGTGTTG TTTTTACACT AACCAGTCAG GGATAGTACG AGATGCCACC 660
TGGCATTTAC AGGAAAGGGC TTCTGATATC AGACAATGCC TTTCAAACTC TTATACCAAC 720
CTCTGGAGTT GGGCAACATG GCTTCTTCCA TTTCTAGGTC CCATGGCAGC CATCTTGCTG 780
TTACTCACCT TTGGGCCCTG TATTTTTAAG CTTCTTGTCA AATTTGTTTC CTCTAGGATC 840
GAAGCCATCA AGCTACAGAT GGTCTTACAA ATGGAACCCC AAATGAGTTC AACTAACAAC 900
TTCTACCAAG GACCCCTGGA ACGATCCACT GGCACTTCCA CTAGCCTAGA GATTCCCCTC 960
TGGAAGACAC TACAACTGCA GGGCCCCTTC TTTGCCCCTA TCCAGCAGGA AGTAGCTAGA 1020
GCGGTCATCG GCCAAATTCC CAACAGCAGT TGGGGTGTCC TGTTTAGAGG GGGGATTGAA 1080
GAGGTGACAG CCTGCTGGCA GCCTCACAGC CCTCGTTGGY TCTCAGTGCC TCCTCAGCCT 1140
TGGTGCCCAC TCTGGCCGTG CTTGAGGAGC CCTTCAGCCT GCCACTGCAC TGTGGGAGCC 1200
TCTTTCTGGG CTGGACAAGG CCGGAGCCAG CTCCCTCAGC TTGCAGGGAG GTATGGAGGG 1260
AGAGATGCAG GCGGGAACCA GGGCTGCGCA TGGCGCTTGC GGGCCAGCAT GAGTTCCAGG 1320
TGGGCGTGGG CTCGGCGGGC CCCACACTCG GGCAGTGAGG GGCTTAGCAC CTGGGCCAGA 1380
CAGATGCTGT GCTCAACTTC TTCGCTGGGC CTTAGCTGCC TTCCCCGTGG GGCAGGGCTY 1440
CGGGAACMTG CAGCCTGCCC ATGCTTGAGC CCCCCACCCC GCCGTGGGTT CYTGCACAGC 1500
CCAAGCTTCC CGGACAAGCA CCACCCCTTA TCCACGGTGC CCAGTCCCAT CAACCACCCA 1560
AGGGTTGAGG AGTGCGGGCA CACAGCGCGG GATTGGCAGG CAGTTCCACT TGCGGCCTTG 1620
GTGCGGGATC CACTGCGTGA AGCCAGCTGG GCTCCTGAGT CTGGTGGGGA CTTGGAGAAT 1680
CTTTATGTCT AGCTAAGGGA TTGTAAATAC ACCAATCAGC AC 1722



(2) INFORMATION FOR SEQ ID NO: 59: 
(i) SEQUENCE CHARACTERISTICS: 



WO 98/23755 PCT/IB97/01482



(A) LENGTH: 495 base pairs

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 



CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60
GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120
GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180
ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240
GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300
CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360
TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420
CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480
TGCCGCAGAC ATTTA 495

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2503 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:





CCAAGAACCC ACCAATTCCG GANCACATTT TGGCGACCAC GAAGGGACTT TCGCATATCG 60
CCAAGCGGTG AGACAATAGC CGAGCGGTGA GACCTTTCCC AATCGCCAAG CAGTGAGTAC 120
CATCAGACCC CTTTCACTTG CTATTCTGTC CTATCTTTCT TTAGAATTCG GGGGCTAAAT 180
ACCGGGCATC TGTCAGCCAT TTAAAAGTGA CTAGCGGGCC GCCGGACTAA AGACACGGGT 240
GTCAAGCTTT CTGGGAAAGG GCTCTCTAAC AACCCCCAAC TCTTTGGAGT TGGGACCGTT 300
GGTTTGCCTA GAACCAGCTT CCGCTTTTCC TGTACTTCTG GGCTGAGCCG TGGGTTGACA 360
GTGAAGGAAA GCCATGCATC TCCGGGGTCT CGMCAACATG TTGGTTGACC CTGCGGCCAT 420
GAGTGGAACT CTCAAAAGCA TGTCGCCCAA GCGACACTCG CCTATCTATC CTATCTATCC 480
TGACCCTTGC CCTCTGGGTC CTAATGCCTG CCAGACAAAC TTCCTCTCGC CTCTCTTCTC 540
TGAAGCTAGA ACCGCTTCTA AAAATTGCTA CCTGGTCTCT GGTGCTTTTC CTARTTTCTC 600
CTATAAAGAA TGAWTTCTAG TATTAAACTC CAGGACTCTG TTACCTTCTT TAGGCACCCG 660
GGCTCACCAA TCAGAAAGAC ACAGTTTTTG CCCAAGGCCC CATCGTAGTG GGGACTACCT 720
GGAATTTTAG GATCCCTCCT CAGACTAACA GGCCTAACAA AAGTTATTCC TGAAGCTAGG 780
ATATGGGGAG CCTCAGAAAT TGTATCCCTC CTATTCATAT AAGTGAGAAC AAAAGGTGTC 840
ACTCTTCCAA CCCTGAAGAT CCCCTCCCTC CCTCAGGGTA TGGCCCTCCA TTTCATTTTT 900
GTGGCATAAC ATCTTTATAG GATGGGGTAA AGTCCCAATA CTAACAGGAG AATGCTTAGG 960
ACTCTAACAG GTTTTTGAGA ATGCGTCAGT AAGGGCCACT AAATCTGATT TTTCTCAGTC 1020
GGTCCTCCTT GTGGTCTAGG AGGACAGGCA AGGTTGTGCA GGTTTTCGAG AATGCGTCAG 1080
TAAGGACCAC TAAATCCGAC CTTCCTCGGT CCTCCATGTG GTCTGGGAGG AAAACTAGTG 1140
TTTCTGCTGC TGCGTCGGTG AGCGCAACTA TTCAAGTCAG CAGGGTCCAG GGACCGTTGC 1200
AGGTTCTTGG GCAGGGGTTG TTTCTGCTGC TGCATTGGTG AATGCAACTA TTCTGATCAG 1260
CAGGGTCCCA GGACCATTGC AGGTCCTTGG GCAGGGAGAG AAACAAAACA AACCAAAACT 1320
GTGGGCGGTT TTGTCTTTCA TATGGGAAAC ACTCAGGCAT CAACAGGTTC ACCCTTGAAA 1380
TGCATCCTAA GCCATTGGGA CCAATTTGAC CCACAAACCC TGAAAAAGAG GAGGCTCATT 1440
TTTTCCTGCA CTACGGCTTG GCCCCAATAT TCTCTTTYTG ATGGGGAAAA ATGGCCACCT 1500
GAGGGAAGCA CAAATTACAA TAYTATCCTA CAGCYTGATC TTTTCTGTAA GAGGGAAGGC 1560
AAATGGAGTG AATACCTTAT GTCCAAGCTT TCTTTTCATT GAGGGAGAAT ACACAACTAT 1620
GCAAAGCTTG CAATTTACAT CCCACAGGAG GACCCTTCAG CTTACCCCCA TATCCTAGCC 1680
TCCCTATAGC TTCCCTTCCT ATTGATGATA CTCCTCCTCT AATCTCCCCT GCCCAGAAGG 1740
AAATAAGCAA AGAAATCTCC AAAGGTCCAC AAAAACCCCC GGGCTATCGG TTATGTCCCT 1800
TCAAGYTGTA GGGGGAGGGG AATTTGGCCC AACCCGGGTG CATGTCCCTT CTCCCTCTCT 1860
GATTTAAAGC AGATCAAGGC AGACCTGGGG AAGTTTTCAG ATGATCCTGA TAGGTACATA 1920
GATGTCCTAC AGGGTCTAGG GCAAACCTTT GACCTCACTT GGAGAGACGT CATGCTACTG 1980
TTAGATCAAA CCCTGGCCTT TAATGAAAAG AATGCGGCTT TAGCTGCAGC CTGAGAGTTT 2040
GGAGATACCT GGTATCCTAG TCAAGTAAAT GAAAGAATGA CAGCCGAAGA AAGGGACAAC 2100
TTCCTTACTG GTCAGCAACC CATCCCCAGT ATGGATCCCC ACTGGGACTT TGACTCAGAT 2160
CATGGGGACT GGAGTCGTAA ACATCTGTTG ATCTGTGTTC TGGAAGGACT AAGGAGAATT 2220
GGGAAAAAGC CCATGAATTA TTCAATGATA TCCACCATAA CCCAGGGAAA GGAAGAAAAT 2280
CCTTCTGCCT TCCTCGAGCG GCTACAAGAG GCCTTAAGAA AATATACTCC CCTGTCACCC 2340
GAATCACTCG AGGGTCAATT GATTCTAAAA GATAAGTTTA TTACCCAATC AGCCACAGAT 2400
ATCAGGAGAA AGCTCCAAAA GCAAGCCCTG AGCCTGAACA AAATCTAGAG ACATTATTAA 2460
ACCTGGCAAC CTTGGTGTTC TATAATAGGG ACCAAGAGGA ACA 2503



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1167 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:



AAGGAAACTC AGAAAGCCAA TACCCATTTA GTAAGATGGA CACCAGAAGC AGAAGCAGCT 60
TTCCAGGCCC TAAAGAAATC CCTAACCCAA GCCCCAGTGT TAAGCTTGCC AACGGGGCAA 120
GACTTTTCTT TATATGTCAC AGAAAAACAG GAATAGCTCT AGGAGTCCTT ACACAGGTCC 180
AAGGGACAAG CTTGCAACCT GTGGCATACC TGAGTAAGGA AACTGATGTA NTGGCAAAGG 240
GTTGGCCTCA TTGTTTACAG GTAGGGCAGC AGTAGCAGTC TTAGTTTCTG AAACAGTTAA 300
AATAATACAG GGAAGAGATC TTACTGTGTG GACATCTCAT GATGTGAACG GCATACTCAC 360
TGCTAAAGAG GACTTGTGGC TGTCAGACAA CCATTTACTT AAATAGCAGG TTCTATTACT 420
TGAAGTGCCA GTGCTGCGAC TGCACATTTG TGCAACTCTT AACCCAGCCA CATTTCTTCC 480
AGACAATGAA GAAAAGATAG AACATAACTG TCAACAAGTA ATTGCTCAAA CCTATGCTGC 540
TCGAGGGGAC CTTCTAGAGG TTCCCTTGAC TGATCCCGAC CTCAACTTGT ATACTGATGG 600
AAGTTCCTTG GCAGAAAAAG GACTTTGAAA AGCGGGGTAT GCAGTGATCA GTGATAATGG 660
AATACTTGAA AGTAATCGCC TCACTCCAGG AACTAGTGCT CACCTGGCAG AACTAATAGC 720
CCTCACTTGG GCACTAGAAT TAGGAGAAGG AAAAAGGGTA AATATATATT CAGACTCTAA 780
GTATGCTTAC CTAGTCCTCC ATGCCCATGC AGCAATATGG AGAGAGAGGG AATTCCTAAC 840
TTCTGAGGGA ACACCTATCA ACCATCAGGG AAGCCATTAG GAGATTATTA TTGGCTGTAC 900
AGAAACCTAA AGAGGTGGCA GTCTTACACT GCCAGGGTCA TCAGGAAGAA GAGGAAAGGG 960
AAATAGAAGG CAATCGCCAA GCGGATATTG AAGCAAAAAA AGCCGCAAGG CAGGACTCTC 1020
CATTAGAAAT GCTTATAGAA GGACCCCTAG TATGGGGTAA TCCCCTCTGG GAAACCAAGC 1080
CCCAGTACTC AGCAGGAAAA ATAGAATAGG AAACCTCACA AGGACATACT TTCCTCCCCT 1140
CCAGATGGCT AGCCACTGAG GAAGGAA 1167

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 78 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

TCCAAAGGCA CCAGGGCCCT CAGTGAGGAA CGTATCCAGC CTATACTGGC TTATCCTCAT 60
CCCAAAACCC TAAAGCAA 78



(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Ser Lys Gly Thr Arg Ala Leu Ser Glu Glu Arg Ile Gln Pro Ile Leu
1               5                   10                  15
Ala Tyr Pro His Pro Lys Thr Leu Lys Gln
            20                  25



(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
AAATGTCTGC GGCACCAATC TCCATGTT 28 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

AAGGGGCATG GACGAGGTGG TGGCTTATTT 30 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
GGAGAAGAGC AGCATAAGTG G 21

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
GTGCTGATTG GTGTATTTAC AATCC 25



(2) INFORMATION FOR SEQ ID NO: 68: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
GACTCGCTGC AGATCGATTT TTTTTTTTTT TTTT 34 



(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

GCCATCAAGC CACCCAAGAA CTCTTAACTT 30 



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
CCAATAGCCA GACCATTATA TACACTAATT 30



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
GCCATAACTG CAACCCAAGA GTT 23

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
GGACGAGGTG GTGGCTTATT TCT 23

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
AACTTGCGTG CTAGAAGGAC TAAGG 25

(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

AACTTTTCCC TTTTCCAGAT CCTC 24 



(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

GCATACCAGG CAAGTGGACA TT 22 



(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

CTGTCCGTTG GGTTTCCTTA CTCCT 25 



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

GAGGCTCTGG AAAAGGGAAA AGTT 24 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

CTGTCCGTTG GGTTTCCTTA CTCCT 25

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

AGGAGTAAGG AAACCCAACG GACAG 25

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
TGTATATAAT GGTCTGGCTA TTGGG 25

(2) INFORMATION FOR SEQ ID NO: 81: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
AGGAGTAAGG AAACCCAACG GACAG 25

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
TTCGGCAGAA ACCTGTTATG CCAAGG 26

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
CTCGATTTCT TGCTGGGCCT TA 22

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

GTTGATTCCC TCCTCAAGCA 20

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

CTCTACCAAT CAGCATGTGG 20

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

TGTTCCTCTT GGTCCCTAT 19 



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 433 amino acids

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Met Ala Thr Ala Thr Gly Thr Gly Ile Ala Gly Leu Ser Thr Ser Leu
1               5                   10                  15
Ser Tyr Tyr His Thr Leu Ser Lys Asn Phe Ser Asp Ser Leu Gln Glu
            20                  25                  30
Ile Met Lys Ser Ile Leu Thr Leu Gln Ser Gln Leu Asp Ser Leu Ala
        35                  40                  45
Ala Met Thr Leu Gln Asn Arg Arg Gly Pro His Leu Leu Thr Ala Glu
    50                  55                  60
Lys Gly Gly Leu Cys Thr Phe Leu Gly Glu Glu Cys Cys Phe Tyr Thr
65                  70                  75                  80
Asn Gln Ser Gly Ile Val Arg Asp Ala Thr Trp His Leu Gln Glu Arg
                85                  90                  95
Ala Ser Asp Ile Arg Gln Cys Leu Ser Asn Ser Tyr Thr Asn Leu Trp
            100                 105                 110
Ser Trp Ala Thr Trp Leu Leu Pro Phe Leu Gly Pro Met Ala Ala Ile
        115                 120                 125
Leu Leu Leu Leu Thr Phe Gly Pro Cys Ile Phe Lys Leu Leu Val Lys
    130                 135                 140
Phe Val Ser Ser Arg Ile Glu Ala Ile Lys Leu Gln Met Val Leu Gln
145                 150                 155                 160
Met Glu Pro Gln Met Ser Ser Thr Asn Asn Phe Tyr Gln Gly Pro Leu
                165                 170                 175
Glu Arg Ser Thr Gly Thr Ser Thr Ser Leu Glu Ile Pro Leu Trp Lys
            180                 185                 190
Thr Leu Gln Leu Gln Gly Pro Phe Phe Ala Pro Ile Gln Gln Glu Val
        195                 200                 205
Ala Arg Ala Val Ile Gly Gln Ile Pro Asn Ser Ser Trp Gly Val Leu
    210                 215                 220
Phe Arg Gly Gly Ile Glu Glu Val Thr Ala Cys Trp Gln Pro His Ser
225                 230                 235                 240
Pro Arg Trp Xaa Ser Val Pro Pro Gln Pro Trp Cys Pro Leu Trp Pro
                245                 250                 255
Cys Leu Arg Ser Pro Ser Ala Cys His Cys Thr Val Gly Ala Ser Phe
            260                 265                 270
Trp Ala Gly Gln Gly Arg Ser Gln Leu Pro Gln Leu Ala Gly Arg Tyr
        275                 280                 285
Gly Gly Arg Asp Ala Gly Gly Asn Gln Gly Cys Ala Trp Arg Leu Arg
    290                 295                 300
Ala Ser Met Ser Ser Arg Trp Ala Trp Ala Arg Arg Ala Pro His Ser
305                 310                 315                 320
Gly Ser Glu Gly Leu Ser Thr Trp Ala Arg Gln Met Leu Cys Ser Thr
                325                 330                 335
Ser Ser Leu Gly Leu Ser Cys Leu Pro Arg Gly Ala Gly Leu Arg Glu
            340                 345                 350
Xaa Ala Ala Cys Pro Cys Leu Ser Pro Pro Pro Arg Arg Gly Phe Leu
        355                 360                 365
His Ser Pro Ser Phe Pro Asp Lys His His Pro Leu Ser Thr Val Pro
    370                 375                 380
Ser Pro Ile Asn His Pro Arg Val Glu Glu Cys Gly His Thr Ala Arg
385                 390                 395                 400
Asp Trp Gln Ala Val Pro Leu Ala Ala Leu Val Arg Asp Pro Leu Arg
                405                 410                 415
Glu Ala Ser Trp Ala Pro Glu Ser Gly Gly Asp Leu Glu Asn Leu Tyr
            420                 425                 430
Val
433



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 693 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:



CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60
GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120
GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180
ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240
GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300
CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360
TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420
CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480
TGCCGCAGAC ATTTACTAAC TTGCGTGCTA GAAGGACTAA GGAAAACTAG GAAGACTATG 540
AATTATTCAA TGATGTCCAC TATAACACAG GGGAAAGGAA GAAAATCCTA CTGCCTTTCT 600
GGAGAGACTA AGGGAGGCAT TGAGGAAGCA TACCAGGCAA GTGGACATTG GAGGCTCTGG 660
AAAAGGGAAA AGTTGGGCAA ATTGAATGCC TAA 693

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:



(A) LENGTH: 1577 base pairs 



(B) TYPE: nucleotide 



(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 



AACTTGCGTG CTAGAAGGAC TAAGGAAAAC TAGGAAGACT ATGAATTATT CAATGATGTC 60
CACTATAACA CAGGGGAAAG GAAGAAAATC CTACTGCCTT TCTGGAGAGA CTAAGGGAGG 120
CATTGAGGAA GCATACCAGG CAAGTGGACA TTGGAGGCTC TGGAAAAGGG AAAAGTTGGG 180
CAAATTGAAT GCCTAATAGG GCTTGCTTCC AGTGCAGTCT ACAAGGACGC TTTAGAAAAG 240
ATTGTCCAAG TAGAAATAAG CCGCCCCTCG TCCATGCCCC TTATGTCAAG GGAATCACTG 300
GAAGGCCTAC TGCCCCAGGG GACGAAGGTC CTCTGAGTCA GAAGCCACTA ACCTGATGAT 360
CCAGCAGCAG GACTGAGGGT GCCCGGGGCA AGTGCCAGCC CATGCCATCA CCCTCAGAGC 420
CCCGGGTATG TTTGACCATT GAGAGCCAGG AAGTTAACTG TCTCCTGGAC ACTGGCGCAG 480
CCTTCTCAGT CTTACTTTCC TGTCCCAGAC AATTGTCCTC CAGATCTGTC ACTATCCGAG 540
GGGTCCTAAG ACAGCCAGTC ACTACATACT TCTCTCAGCC ACTAAGTTGT GACTGGGGAA 600
CTTTACTCTT TTCACATGCT TTTCTAATTA TGCCTGAAAG CCCCACTCCC TTGTTAGGGA 660
GAGACATTTT AGCAAAAGCA GGGGCCATTA TACACCTGAA CATAGGAAAA GGAATACCCA 720
TTTGCTGTCC CCTGCTTGAG GAAGGAATTA ATCCTGAAGT CTGGGCAATA GAAGGACAAT 780
ATGGACAAGC AAAGAATGCC CGTCCTGTTC AAGTTAAACT AAAGGATTCT GCCTCCTTTC 840
CCTACCAAAG GAAGTACCCT CTTAGACCCG AGGCCCTACA AGGACTCAAA AGATTGTTAA 900
GGACCTAAAA GCCCAAGGCC TAGTAAAACC ATGCAGTAGC CCCTGCAATA CTCCAATTTT 960
AGGAGTAAGG AAACCCAACG GACAGTGGAG GTTAGTGCAA GATCTCAGGA TTATTAATGA 1020
GGCTGTTTTT CCTCTATACC CAGCTGTATC TAGCCCTTAT ACTCTGCTTT CCCTAATACC 1080
AGAGGAAGCA GAGTAGTTTA CAGTCCTGGA CCTTAAGGAT GCCTCTTTCT GCATCCCTGT 1140
ACATCCTGAT TCTCAATTCT TGTTTGTCTT TGAAGATCCT TTGAACCCAA TGTCTCAATT 1200
CACCTGGACT GTTTTACCCC AGGGGTTCCG GGATAGCCCC CATCTATTTG GCCAGGCATT 1260
AGCCCAAGAC TTGAGCCAAT TCTCATACCT GGACATCTTG TCCTTCGGTA TGGGATGATT 1320
TAATTTTAGC CACCCGTTCA GAAACCTTGT GCCATCAAGC CACCCAAGCG TTCTTAAATT 1380
TCCTCACTCC GTGTGGCTAC AAGGTTTCCA AACCAAAGGC TCAGCTCTGC TCACAGCAGG 1440
TTAAATACTT AGGGTTAAAA TTATCCAAAG GCACCAGGGC CCTCTGTGAG GAATGTATCC 1500
AACCTGTACT GGCTTATCTT CATCCCAAAA CCCTAAAGCA ACTAAGAAGG TCCTTGGCAT 1560
AACAGGTTTC TGCCGAA 1577



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 182 amino acids

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Ser Ser Ser Arg Thr Glu Gly Ala Arg Gly Lys Cys Gln Pro Met Pro
1               5                   10                  15
Ser Pro Ser Glu Pro Arg Val Cys Leu Thr Ile Glu Ser Gln Glu Val
            20                  25                  30
Asn Cys Leu Leu Asp Thr Gly Ala Ala Phe Ser Val Leu Leu Ser Cys
        35                  40                  45
Pro Arg Gln Leu Ser Ser Arg Ser Val Thr Ile Arg Gly Val Leu Arg
    50                  55                  60
Gln Pro Val Thr Thr Tyr Phe Ser Gln Pro Leu Ser Cys Asp Trp Gly
65                  70                  75                  80
Thr Leu Leu Phe Ser His Ala Phe Leu Ile Met Pro Glu Ser Pro Thr
                85                  90                  95
Pro Leu Leu Gly Arg Asp Ile Leu Ala Lys Ala Gly Ala Ile Ile His
            100                 105                 110
Leu Asn Ile Gly Lys Gly Ile Pro Ile Cys Cys Pro Leu Leu Glu Glu
        115                 120                 125
Gly Ile Asn Pro Glu Val Trp Ala Ile Glu Gly Gln Tyr Gly Gln Ala
    130                 135                 140
Lys Asn Ala Arg Pro Val Gln Val Lys Leu Lys Asp Ser Ala Ser Phe
145                 150                 155                 160
Pro Tyr Gln Arg Lys Tyr Pro Leu Arg Pro Glu Ala Leu Gln Gly Leu
                165                 170                 175
Lys Arg Leu Leu Arg Thr
            180



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC 36



(2) INFORMATION FOR SEQ ID NO: 92: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

AGATCTGCAG AATTCGATAT CA 22



(2) INFORMATION FOR SEQ ID NO: 93: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2304 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

TCCAGCAGCA GGACTGAGGG TGCCCGGGGC AAGTGCCAGC CCATGCCATC 50
ACCCTCAGAG CCCCGGGTAT GTTTGACCAT TGAGAGCCAG GAAGTTAACT 100
GTCTCCTGGA CACTGGCGCA GCCTTCTCAG TCTTACTTTC CTGTCCCAGA 150
CAATTGTCCT CCAGATCTGT CACTATCCGA GGGGTCCTAG GACAGCCAGT 200
CACTACATAC TTCTCTCAGC CACTAAGTTG TGACTGGGGA ACTTTACTCT 250
TTTCACATGC TTTTCTAATT ATGCCTGAAA GCCCCACTCC CTTGTTAGGG 300
AGAGACATTT TAGCAAAAGC AGGGGCCATT ATACACCTGA ACATAGGAAA 350
AGGAATACCC ATTTGCTGTC CCCTGCTTGA GGAAGGAATT AATCCTGAAG 400
TCTGGGCAAT AGAAGGACAA TATGGACAAG CAAAGAATGC CCGTCCTGTT 450
CAAGTTAAAC TAAAGGATTC TGCCTCCTTT CCCTACCAAA GGAAGTACCC 500
TCTTAGACCC GAGGCCCTAC AAGGANCTCA AAAGATTGTT AAGGACCTAA 550
AAGCCCAAGG CCTAGTAAAA CCATGCAGTA GCCCCTGCAA TACTCCAATT 600
TTAGGAGTAA GGAAACCCAA CGGACAGTGG AGGTTAGTGC AAGATCTCAG 650
GATTATTAAT GAGGCTGTTT TTCCTCTATA CCCAGCTGTA TCTAGCCCTT 700
ATACTCTGCT TTCCCTAATA CCAGAGGAAG CAGAGTGGTT TACAGTCCTG 750
GACCTTAAGG ATGCCTTTTT CTGCATCCCT GTACGTCCTG ACTCTCAATT 800
CTTGTTTGCC TTTGAAGATC CTTTGAACCC AACGTCTCAA CTCACCTGGA 850
CTGTTTTACC CCAAGGGTTC AGGGATAGCC CCCATCTATT TGGCCAGGCA 900
TTAGCCCAAG ACTTGAGTCA ATTCTCATAC CTGGACACTC TTGTCCTTCA 950
GTACGTGGAT GATTTACTTT TAGTCGCCCG TTCAGAAACC TTGTGCCATC 1000
AAGCCACCCA AGAACTCTTA ACTTTCCTCA CTACCTGTGG CTACAAGGTT 1050
TCCAAACCAA AGGCTCGGCT CTGCTCACAG GAGATTAGAT ACTTAGGGCT 1100
AAAATTATCC AAAGGCACCA GGGCCCTCAG TGAGGAACGT ATCCAGCCTA 1150
TACTGGCTTA TCCTCATCCC AAAACCCTAA AGCAACTAAG AGGGTTCCTT 1200
GGCATAACAG GTTTCTGCCG AAAACAGATT CCCAGGTACA CCCCAATAGC 1250
CAGACCATTA TATACACTAA TTAGGGAAAC TCAGAAAGCC AATACCTATT 1300
TAGTAAGATG GACACCTACA GAAGTGGCTT TCCAGGCCCT AAAGAAGGCC 1350
CTAACCCAAG CCCCAGTGTT CAGCTTGCCA ACAGGGCAAG ATTTTTCTTT 1400
ATATGCCACA GAAAAAACAG GAATAGCTCT AGGAGTCCTT ACGCAGGTCT 1450
CAGGGATGAG CTTGCAACCC GTGGTATACC TGAGTAAGGA AATTGATGTA 1500
GTGGCAAAGG GTTGGCCTCA TTGTTTATGG GTAATGGCGG CAGTAGCAGT 1550
CTTAGTATCT GAAGCAGTTA AAATAATACA GGGAAGAGAT CTTACTGTGT 1600
GGACATCTCA TGATGTGAAC GGCATACTCA CTGCTAAAGG AGACTTGTGG 1650
TTGTCAGACA ACCATTTACT TAATTATCAG GCTCTATTAC TTGAAGAGCC 1700
AGTGCTGAGA CTGCGCACTT GTGCAACTCT TAAACCCGCC ACATTTCTTC 1750
CAGACAATGA AGAAAAGATA GAACATAACT GTCAACAAGT AATTGCTCAA 1800
ACCTATGCTG CTCGAGGGGA CCTTCTAGAG GTTCCCTTGA CTGATCCCGA 1850
CCTCAACTTG TATACTGATG GAAGTTCCTT GGCAGAAAAA GGACTTCGAA 1900
AAGCGGGGTA TGCAGTGATC AGTGATAATG GAATACTTGA AAGTAATCGC 1950
CTCACTCCAG GAACTAGTGC TCACCTGGCA GAACTAATAG CCCTCACTTG 2000
GGCACTAGAA TTAGGAGAAG GAAAAAGGGT AAATATATAT TCAGACTCTA 2050
AGTATGCTTA CCTAGTCCTC CATGCCCATG CAGCAATATG GAGAGAGAGG 2100
GAATTCCTAA CTTCTGAGGG AACACCTATC AACCATCAGG AAGCCATTAG 2150
GAGATTATTA TTGGCTGTAC AGAAACCTAA AGAGGTGGCA GTCTTACACT 2200
GCCAGGGTCA TCAGGAAGAA GAGGAAAGGG AAATAGAAGG CAATCGCCAA 2250
GCGGATATTG AAGCAAAAAA AGCCGCAAGG CAGGACTCTC CATTAGAAAT 2300
GCTT 2304



(2) INFORMATION FOR SEQ ID NO: 94: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2364 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 



ATGATCCAGC AGCAGGACNG AGGGTGCCCG GGGCAAGCGC CAGCCCATGC 50
CATCACCCTC ACAGAGCCCC AGGTATGCTT GACCATTGAG GGTCAGAAGG 100
GTNACTGTCT CCTGGACACT GGCGGNGCCT TCTCAGTCTT ACTTTCCTGT 150
CCTGGACAAC TGTCCTCCAG ATCTGTCACT GTCCGAGGGG TCCTAGGACA 200
GCCAGTCACT AGATACTTCT CCCAGCCACT AAGTTGTGAC TGGGGAACTT 250
TACTCTTCCC ACATGCTTTT CTAATTATGC CTGAAAGCCC CACTCTCTTG 300
TTGGGGAGAG ACATTCTAGC AAAAGCAGGG GCCATTATAC ATGTGAATAT 350
AGGAGAAGGA ACAACTGTTT GTTGTCCCCT GCTTGAGGAA GGAATTAATC 400
CTGAAGTCCG GGCAACAGAA GGACAATATG GACAAGCAAA GAATGCCCGT 450
CCTGTTCAAG TTAAACTAAA GGATTCCACC TCCTTTCCCT ACCAAAGGCA 500
GTACCCCCTC AGACCCGAGA CCCAACAAGA ACTCCAAAAG ATTGTAAAGG 550
ACCTAAAAGC CCAAGGCCTA GTAAAACCAA GCAATAGCCC TTGCAAGACT 600
CCAATTTTAG GAGTAAGGAA ACCCAACGGA CAGTGGAGGT TAGTGCAAGA 650
ACTCAGGATT ATCAATGAGG CTGTTGTTCC TCTATACCCA GCTGTACCTA 700
ACCCTTATAC AGTGCTTTCC CAAATACCAG AGGAAGCAGA GTGGTTTACA 750
GTCCTGGACC TTAAGGATGC CTTTTTCTGC ATCCCTGTAC GTCCTGACTC 800
TCAATTCTTG TTTGCCTTTG AAGATCCTTT GAACCCAACG TCTCAACTCA 850
CCTGGACTGT TTTACCCCAA GGGTTCAGGG ATAGCCCCCA TCTATTTGGC 900
CAGGCATTAG CCCAAGACTT GAGTCAATTC TCATACCTGG ACACTCTTGT 950
CCTTCAGTAC ATGGATGATT TACTTTTAGT CGCCCGTTCA GAAACCTTGT 1000
GCCATCAAGC CACCCAAGAA CTCTTAACTT TCCTCACTAC CTGTGGCTAC 1050
AAGGTTTCCA AACCAAAGGC TCGGCTCTGC TCACAGGAGA TTAGATACTN 1100
AGGGCTAAAA TTATCCAAAG GCACCAGGGC CCTCAGTGAG GAACGTATCC 1150
AGCCTATACT GGCTTATCCT CATCCCAAAA CCCTAAAGCA ACTAAGAGGG 1200
TTCCTTGGCA TAACAGGTTT CTGCCGAAAA CAGATTCCCA GGTACASCCC 1250
AATAGCCAGA CCATTATATA CACTAATTAN GGAAACTCAG AAAGCCAATA 1300
CCTATTTAGT AAGATGGACA CCTACAGAAG TGGCTTTCCA GGCCCTAAAG 1350
AAGGCCCTAA CCCAAGCCCC AGTGTTCAGC TTGCCAACAG GGCAAGATTT 1400
TTCTTTATAT GCCACAGAAA AAACAGGAAT AGCTCTAGGA GTCCTTACGC 1450
AGGTCTCAGG GATGAGCTTG CAACCCGTGG TATACCTGAG TAAGGAAATT 1500
GATGTAGTGG CAAAGGGTTG GCCTCATNGT TTATGGGTAA TGGNGGCAGT 1550
AGCAGTCTNA GTATCTGAAG CAGTTAAAAT AATACAGGGA AGAGATCTTN 1600
CTGTGTGGAC ATCTCATGAT GTGAACGGCA TACTSRCTGC TAAAGGAGAC 1650
TTGTGGTTGT CAGACAACCA TTTACTTAAN TAYCAGGCYY TATTACTTGA 1700
AGAGCCAGTG CTGNGACTGC GCACTTGTCC AACTCTTAAA CCCAAACTTA 1750
TGCTGCCCAG AAGGATCTTT NTAGAGGTCC CCTTAGCCAA CCCTGACCTC 1800
AACTATATAT ATACTGATGG AAGTTCGTTT GTAGAAAAGG GATTACAAAG 1850
GGNAGGATAT NCCATAGGTG TTAGTGATAA AGCAGTACTT GAAAGTAAGC 1900
CTCTTCCCCC CCAGGGACCA GCGCCCCCGT TAGCAGAACT AGTGGCACTG 1950
ACCCCGCGAG CCTTAGAACT TTGGAAAGGG AGGAGGATAA ATGTGTATAC 2000
AGATAGCAAG TATGCTTATC TAATCCGAAA TGCCCATGTT GCAATATGGA 2050
AAGAAAGGGA GTTCCTAACC TCTGGGGGAA CCCCCATTAA ATACCACAAG 2100
TTAATCATGG AGTTATTGCA CACAGTGCAA AAACTCAAGG AGGTGGAAGT 2150
CTTACACTGC CAAAGCCATC AGAAAAGGGA AAGAGGGGAA GAGCAGCATA 2200
AGTGGCTACA GAGGCAAGGA AAGACTAGCA GAAAGGAAAG AGAGAAAGAG 2250
ACAGAAAGTC AGAGAGAGAG AGAGGAAGAG ACAGAGCACA AAGAGGGAGT 2300
CAGAGAGAGA GAGAGACAGA GAGTCAGAGA GAAGGAAAGA GAGAGAGGAA 2350
GAGACAAAGA ATGAH 2365

(2) INFORMATION FOR SEQ ID NO: 95:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 768 amino acids 
(B) TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 





SSSRTEGARG 


KCQPMPSPSE 


PRVCLTIESQ 


EVNCLLDTGA 


AFSVLLSCPR 


50 




QLSSRSVTIR 


GVLGQPVTTY 


FSQPLSCDWG 


TLLFSHAFLI 


MPESPTPLLG 


100 


10 


RDILAKAGAI 


IHLNIGKGIP 


ICCPLLEEGI 


NPEVWAIEGQ 


YGQAKNARPV 


150 




QVKLKDSASF 


PYQRKYPLRP 


EALQGXQKIV 


KDLKAQGLVK 


PCSSPCNTPI 


200 




LGVRKPNGQW 


RLVQDLRI IN 


EAVFPLYPAV 


SSPYTLLSLI 


PEEAEWFTVL 


250 




DLKDAFFCIP 


VRPDSQFLFA 


FEDPLNPTSQ 


LTWTVLPQGF 


RDSPHLFGQA 


300 




LAQDLSQFSY 


LDTLVLQYVD 


DLLLVARSET 


LCHQATQELL 


TFLTTCGYKV 


350 


15 


SKPKARLCSQ 


EIRYLGLKLS 


KGTRALSEER 


IQPILAYPHP 


KTLKQLRGFL 


400 




GITGFCRKQI 


PRYTPIARPL 


YTLIRETQKA 


NTYLVRWTPT 


EVAFQALKKA 


450 




LTQAPVFSLP 


TGQDFSLYAT 


EKTGIALGVL 


TQVSGMSLQP 


WYLSKEIDV 


500 




VAKGWPHCLW 


VMAAVAVLVS 


EAVKI IQGRD 


LTVWTSHDVN 


GILTAKGDLW 


550 




LSDNHLLNYQ 


ALLLEEPVLR 


LRTCATLKPA 


TFLPDNEEKI 


EHNCQQVIAQ 


600 


20 


TYAARGDLLE 


VPLTDPDLNL 


YTDGSSLAEK 


GLRKAGYAVI 


SDNGILESNR 


650 




LTPGTSAHLA 


ELIALTWALE 


LGEGKRVNIY 


SDSKYAYLVL 


HAHAAI WRER 


700 




EFLTSEGTP I 


NHQEAIRRLL 


LAVQKPKEVA 


VLHCQGHQEE 


EEREIEGNRQ 


750 




AD I E AKKAAR 


QDSPLEML 








768 



(2) INFORMATION FOR SEQ ID NO: 96:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 114 amino acids

(B) TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

SSSRTEGARG KCQPMPSPSE PRVCLTIESQ EVNCLLDTGA AFSVLLSCPR 50
QLSSRSVTIR GVLGQPVTTY FSQPLSCDWG TLLFSHAFLI MPESPTPLLG 100
RDILAKAGAI IHLN 114



(2) INFORMATION FOR SEQ ID NO: 97:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: amino acids

(B) TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

IGKGIPICCPLLEEGINPEVWAIEGQYGQAKNARPV
QVKLKDSASFPYQRKYPLRPEALQGXQKIVKDLKAQGLVKPCSSPCNTPI
LGVRKPNGQWRLVQDLRIINEAVFPLYPAVSSPYTLLSLIPEEAEWFTVL
DLKDAFFCIPVRPDSQFLFAFEDPLNPTSQLTWTVLPQGFRDSPHLFGQA
LAQDLSQFSYLDTLVLQYVDDLLLVARSETLCHQATQELLTFLTTCGYKV
SKPKARLCSQEIRYLGLKLSKGTRALSEERIQPILAYPHPKTLKQLRGFL
GITGFCRKQIPRYTPIARPLYTLIRETQKANTYLVRWTPTEVAFQALKKA
LTQAPVFSLPTGQDFSLYATEKTGIALGVLTQVSGMSLQPWYLSKEIDV
VAKGWPHCLWVKAAVAVLVSEAVKIIQGRDLTVWTSHDVNGILTAKGDLW
LSDNHLLNYQALLLEEPVLRLRTCATLKPATFLPDNEEKIEHNCQQVIAQ
TYAARGDLLEVPLTDPDLNLYTDGSSLAEKGLRKAGYAVISDNGILESNR
LTPGTSAHLAELIALTWALELGEGKRVNIYSDSKYAYLVLHAHAAIWRER
EFLTSEGTPINHQEAIRRLLLAVQKPKEVAVLHCQGHQEEEEREIEGNRQ
ADIEAKKAARQDSPLEML

(2) INFORMATION FOR SEQ ID NO: 98:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: amino acids

(B) TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

LYTDGSSLAEKGLRKAGYAVISDNGILESNR
LTPGTSAHLAELIALTWALELGEGKRVNIYSDSKYAYLVLHAHAAIWRER
EFLTSEGTPINHQEAIRRLLLAVQKPKEVAVLHCQGHQEEEEREIEGNRQ
ADIEAKKAARQDSPLEML

(2) INFORMATION FOR SEQ ID NO: 99
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
AGGAGTAAGG AAACCCAACG GAC 23

(2) INFORMATION FOR SEQ ID NO: 100
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
TAAGAGTTGC ACAAGTGCG 19

(2) INFORMATION FOR SEQ ID NO: 101
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
TCAGGGATAG CCCCCATCTA T 21

(2) INFORMATION FOR SEQ ID NO: 102
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
AACCCTTTGC CACTACATCA ATTT 24

(2) INFORMATION FOR SEQ ID NO: 103
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
AGCAGCAGGA CTGAGGGT 18



(2) INFORMATION FOR SEQ ID NO: 104
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
CTGTCCGTTG GGTTTCCTTA CTCCT 25

(2) INFORMATION FOR SEQ ID NO: 105
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
GACAGCAAAT GGGTATTCCT TTCC 24

(2) INFORMATION FOR SEQ ID NO: 106
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
AGGAGTAAGG AAACCCAACG GACA 24

(2) INFORMATION FOR SEQ ID NO: 107
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:
TGTATATAAT GGTCTGGCTA TTGGG 25



(2) INFORMATION FOR SEQ ID NO: 108
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
TTCGGCAGAA ACCTGTTATG CCAAGG 26

(2) INFORMATION FOR SEQ ID NO: 109
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
GGCTCTGCTC ACAGGAGATT AGATAC 26

(2) INFORMATION FOR SEQ ID NO: 110
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
AAAGGCACCA GGGCCCTCAG TGAGGA 26



(2) INFORMATION FOR SEQ ID NO: 111
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
GGTTTAAGAG TTGCACAAGT GCGCAGTC 28

(2) INFORMATION FOR SEQ ID NO: 112:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 310 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

GCTTATAGAA GGACCCCTAG TATGGGGTAA TCCCCTCTGG GAAACCAAGC CCCAGTACTC 60
AGCAGGAAAA ATAGAATAGG AAACCTCACA AGGACATACT TTCCTCCCCT CCAGATGGCT 120
AGCCACTGAG GAAGGAAAAA TACTTTCACC TGCAGCTAAC CAACAGAAAT TACTTAAAAC 180
CCTTCACCAA ACCTTCCACT TAGGCATTGA TAGCACCCAT CAGATGGCCA AATTATTATT 240
TACTGGACCA GGCCTTTTCA AAACTATCAA GAAGATAGTC AGGGGCTGTG AAGTGTGCCA 300
AAGAAATAAT 310

(2) INFORMATION FOR SEQ ID NO: 113:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 103 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Leu Ile Glu Gly Pro Leu Val Trp Gly Asn Pro Leu Trp Glu Thr Lys
1 5 10 15

Pro Gln Tyr Ser Ala Gly Lys Ile Glu Xaa Glu Thr Ser Gln Gly His
20 25 30

Thr Phe Leu Pro Ser Arg Trp Leu Ala Thr Glu Glu Gly Lys Ile Leu
35 40 45

Ser Pro Ala Ala Asn Gln Gln Lys Leu Leu Lys Thr Leu His Gln Thr
50 55 60

Phe His Leu Gly Ile Asp Ser Thr His Gln Met Ala Lys Leu Leu Phe
65 70 75 80

Thr Gly Pro Gly Leu Phe Lys Thr Ile Lys Lys Ile Val Arg Gly Cys
85 90 95

Glu Val Cys Gln Arg Asn Asn
100



(2) INFORMATION FOR SEQ ID NO: 114:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 635 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 
CCCTGTATCT TTAACCTCCT TGTTAAGTTT GTCTCTTCCA GAATCAAAAC TGTAAAACTA 60
CAAATTGTTC TTCAAATGGA GCACCAGATG GAGTCCATGA CTAAGATCCA CCGTGGACCC 120 
CTGGACCGGC CTGCTAGCCC ATGCTCCGAT GTTAATGACA TTGAAGGCAC CCCTCCCGAG 180 
GAAATCTCAA CTGCACAACC CCTACTATGC CCCAATTCAG CGGGAAGCAG TTAGAGCGGT 240 
CATCAGCCAA CCTCCCCAAC AGCACTTGGG TTTTCCTGTT GAGAGGGGGG ACTGAGAGAC 300 
AGGACTAGCT GGATTTCCTA GGCCAACGAA GAATCCCTAA GCCTAGCTGG GAAGGTGACT 360 
GCATCCACCT CTAAACATGG GGCTTGCAAC TTAGCTCACA CCCGACCAAT CAGAGAGCTC 420 
ACTAAAATGC TAATTAGGCA AAAATAGGAG GTAAAGAAAT AGCCAATCAT CTATTGCCTG 480 
AGAGCACAGC GGGAGGGACA AGGATCGGGA TATAAACCCA GGCATTCGAG CCGGCAACGG 540 
CAACCCCCTT TGGGTCCCCT CCCTTTGTAT GGGCGCTCTG TTTTCACTCT ATTTCACTCT 600 
ATTAAATCTT GCAACTGAAA AAAAAAAAAA AAAAA 635

(2) INFORMATION FOR SEQ ID NO: 115: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 77 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

Pro Cys Ile Phe Asn Leu Leu Val Lys Phe Val Ser Ser Arg Ile Lys
1 5 10 15

Thr Val Lys Leu Gln Ile Val Leu Gln Met Glu His Gln Met Glu Ser
20 25 30

Met Thr Lys Ile His Arg Gly Pro Leu Asp Arg Pro Ala Ser Pro Cys
35 40 45

Ser Asp Val Asn Asp Ile Glu Gly Thr Pro Pro Glu Glu Ile Ser Thr
50 55 60

Ala Gln Pro Leu Leu Cys Pro Asn Ser Ala Gly Ser Ser
65 70 75

(2) INFORMATION FOR SEQ ID NO: 116:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
TGGGGTTCCA TTTGTAAGAC CATCTGTAGC TT 32

(2) INFORMATION FOR SEQ ID NO: 117:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1481 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

ATGGCCCTCC CTTATCATAC TTTTCTCTTT ACTGTTCTCT TACCCCCTTT CGCTCTCACT 60
GCACCCCCTC CATGCTGCTG TACAACCAGT AGCTCCCCTT ACCAAGAGTT TCTATGAAGA 120
ACGCGGCTTC CTGGAAATAT TGATGCCCCA TCATATAGGA GTTTATCTAA GGGAAACTCC 180
ACCTTCACTG CCCACACCCA TATGCCCCGC AACTGCTATA ACTCTGCCAC TCTTTGCATG 240
CATGCAAATA CTCATTATTG GACAGGGAAA ATGATTAATC CTAGTTGTCC TGGAGGACTT 300
GGAGCCACTG TCTGTTGGAC TTACTTCACC CATACCAGTA TGTCTGATGG GGGTGGAATT 360
CAAGGTCAGG CAAGAGAAAA ACAAGTAAAG GAAGCAATCT CCCAACTGAC CCGGGGACAT 420
AGCACCCCTA GCCCCTACAA AGGACTAGTT CTCTCAAAAC TACATGAAAC CCTCCGTACC 480
CATACTCGCC TGGTGAGCCT ATTTAATACC ACCCTCACTC GGCTCCATGA GGTCTCAGCC 540
CAAAACCCTA CTAACTGTTG GATGTGCCTC CCCCTGCACT TCAGGCCATA CATTTCAATC 600
CCTGTTCCTG AACAATGGAA CAACTTCAGC ACAGAAATAA ACACCACTTC CGTTTTAGTA 660
GGACCTCTTG TTTCCAATCT GGAAATAACC CATACCTCAA ACCTCACCTG TGTAAAATTT 720
AGCAATACTA TAGACACAAC CAGCTCCCAA TGCATCAGGT GGGTAACACC TCCCACACGA 780
ATAGTCTGCC TACCCTCAGG AATATTTTTT GTCTGTGGTA CCTCAGCCTA TCATTGTTTG 840
AATGGCTCTT CAGAATCTAT GTGCTTCCTC TCATTCTTAG TGCCCCCTAT GACCATCTAC 900
ACTGAACAAG ATTTATACAA TCATGTCGTA CCTAAGCCCC ACAACAAAAG AGTACCCATT 960
CTTCCTTTTG TTATCAGAGC AGGAGTGCTA GGCAGACTAG GTACTGGCAT TGGCAGTATC 1020
ACAACCTCTA CTCAGTTCTA CTACAAACTA TCTCAAGAAA TAAATGGTGA CATGGAACAG 1080
GTCACTGACT CCCTGGTCAC CTTGCAAGAT CAACTTAACT CCCTAGCAGC AGTAGTCCTT 1140
CAAAATCGAA GAGCTTTAGA CTTGCTAACC GCCAAAAGAG GGGGAACCTG TTTATTTTTA 1200
GGAGAAGAAC GCTGTTATTA TGTTAATCAA TCCAGAATTG TCACTGAGAA AGTTAAAGAA 1260
ATTCGAGATC GAATACAATG TAGAGCAGAG GAGCTTCAAA ACACCGAACG CTGGGGCCTC 1320
CTCAGCCAAT GGATGCCCTG GGTTCTCCCC TTCTTAGGAC CTCTAGCAGC TCTAATATTG 1380
TTACTCCTCT TTGGACCCTG TATCTTTAAC CTCCTTGTTA AGTTTGTCTC TTCCAGAATT 1440
GAAGCTGTAA AGCTACAGAT GGTCTTACAA ATGGAACCCC A 1481



(2) INFORMATION FOR SEQ ID NO: 118:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 493 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Met Ala Leu Pro Tyr His Thr Phe Leu Phe Thr Val Leu Leu Pro Pro
1 5 10 15

Phe Ala Leu Thr Ala Pro Pro Pro Cys Cys Cys Thr Thr Ser Ser Ser
20 25 30

Pro Tyr Gln Glu Phe Leu Xaa Arg Thr Arg Leu Pro Gly Asn Ile Asp
35 40 45

Ala Pro Ser Tyr Arg Ser Leu Ser Lys Gly Asn Ser Thr Phe Thr Ala
50 55 60

His Thr His Met Pro Arg Asn Cys Tyr Asn Ser Ala Thr Leu Cys Met
65 70 75 80

His Ala Asn Thr His Tyr Trp Thr Gly Lys Met Ile Asn Pro Ser Cys
85 90 95

Pro Gly Gly Leu Gly Ala Thr Val Cys Trp Thr Tyr Phe Thr His Thr
100 105 110

Ser Met Ser Asp Gly Gly Gly Ile Gln Gly Gln Ala Arg Glu Lys Gln
115 120 125

Val Lys Glu Ala Ile Ser Gln Leu Thr Arg Gly His Ser Thr Pro Ser
130 135 140

Pro Tyr Lys Gly Leu Val Leu Ser Lys Leu His Glu Thr Leu Arg Thr
145 150 155 160

His Thr Arg Leu Val Ser Leu Phe Asn Thr Thr Leu Thr Arg Leu His
165 170 175

Glu Val Ser Ala Gln Asn Pro Thr Asn Cys Trp Met Cys Leu Pro Leu
180 185 190

His Phe Arg Pro Tyr Ile Ser Ile Pro Val Pro Glu Gln Trp Asn Asn
195 200 205

Phe Ser Thr Glu Ile Asn Thr Thr Ser Val Leu Val Gly Pro Leu Val
210 215 220

Ser Asn Leu Glu Ile Thr His Thr Ser Asn Leu Thr Cys Val Lys Phe
225 230 235 240

Ser Asn Thr Ile Asp Thr Thr Ser Ser Gln Cys Ile Arg Trp Val Thr
245 250 255

Pro Pro Thr Arg Ile Val Cys Leu Pro Ser Gly Ile Phe Phe Val Cys
260 265 270

Gly Thr Ser Ala Tyr His Cys Leu Asn Gly Ser Ser Glu Ser Met Cys
275 280 285

Phe Leu Ser Phe Leu Val Pro Pro Met Thr Ile Tyr Thr Glu Gln Asp
290 295 300

Leu Tyr Asn His Val Val Pro Lys Pro His Asn Lys Arg Val Pro Ile
305 310 315 320

Leu Pro Phe Val Ile Arg Ala Gly Val Leu Gly Arg Leu Gly Thr Gly
325 330 335

Ile Gly Ser Ile Thr Thr Ser Thr Gln Phe Tyr Tyr Lys Leu Ser Gln
340 345 350

Glu Ile Asn Gly Asp Met Glu Gln Val Thr Asp Ser Leu Val Thr Leu
355 360 365

Gln Asp Gln Leu Asn Ser Leu Ala Ala Val Val Leu Gln Asn Arg Arg
370 375 380

Ala Leu Asp Leu Leu Thr Ala Lys Arg Gly Gly Thr Cys Leu Phe Leu
385 390 395 400

Gly Glu Glu Arg Cys Tyr Tyr Val Asn Gln Ser Arg Ile Val Thr Glu
405 410 415

Lys Val Lys Glu Ile Arg Asp Arg Ile Gln Cys Arg Ala Glu Glu Leu
420 425 430

Gln Asn Thr Glu Arg Trp Gly Leu Leu Ser Gln Trp Met Pro Trp Val
435 440 445

Leu Pro Phe Leu Gly Pro Leu Ala Ala Leu Ile Leu Leu Leu Leu Phe
450 455 460

Gly Pro Cys Ile Phe Asn Leu Leu Val Lys Phe Val Ser Ser Arg Ile
465 470 475 480

Glu Ala Val Lys Leu Gln Met Val Leu Gln Met Glu Pro
485 490

(2) INFORMATION FOR SEQ ID NO: 119:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
TCAAAATCGA AGAGCTTTAG ACTTGCTAAC CG 32



(2) INFORMATION FOR SEQ ID NO: 120:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1329 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

TCAAAATCGA AGAGCTTTAG ACTTGCTAAC CGCCAAAAGA GGGGGAACCT GTTTATTTTT 60
AGGGGAAGAA TGCTGTTAGT ATGTTAATCA ATCTGGAATC ATTACTGAGA AAGTTAAAGA 120
AATTTGAGAT CGAATATAAT GTAGAGCAGA GGACCTTCAA AACACTGCAC CCTGGGGCCT 180
CCTCAGCCAA TGGATGCCCT GGACTCTCCC CTTCTTAGGA CCTCTAGCAG CTATAATATT 240
TTTACTCCTC TTTGGACCCT GTATCTTCAA CTTCCTTGTT AAGTTTGTCT CTTCCAGAAT 300
TGAAGCTGTA AAGCTACAAA TAGTTCTTCA AATGGAACCC CAGATGCAGT CCATGACTAA 360
AATCTACCGT GGACCCCTGG ACCGGCCTGC TAGACTATGC TCTGATGTTA ATGACATTGA 420
AGTCACCCCT CCCGAGGAAA TCTCAACTGC ACAACCCCTA CTACACTCCA ATTCAGTAGG 480
AAGCAGTTAG AGCAGTTGTC AGCCAACCTC CCCAACAGTA CTTGGGTTTT CCTGTTGAGA 540
GGGTGGACTG AGAGACAGGA CTAGCTGGAT TTCCTAGGCT GACTAAGAAT CCCNAAGCCT 600
ANCTGGGAAG GTGACCGCAT CCATCTTTAA ACATGGGGCT TGCAACTTAG CTCACACCCG 660
ACCAATCAGA GAGCTCACTA AAATGCTAAT CAGGCAAAAA CAGGAGGTAA AGCAATAGCC 720
AATCATCTAT TGCCTGAGAG CACAGCGGGA AGGACAAGGA TTGGGATATA AACTCAGGCA 780
TTCAAGCCAG CAACAGCAAC CCCCTTTGGG TCCCCTCCCA TTGTATGGGA GCTCTGTTTT 840
CACTCTATTT CACTCTATTA AATCATGCAA CTGCACTCTT CTGGTCCGTG TTTTTTATGG 900
CTCAAGCTGA GCTTTTGTTC GCCATCCACC ACTGCTGTTT GCCACCGTCA CAGACCCGCT 960
GCTGACTTCC ATCCCTTTGG ATCCAGCAGA GTGTCCACTG TGCTCCTGAT CCAGCGAGGT 1020
ACCCATTGCC ACTCCCGATC AGGCTAAAGG CTTGCCATTG TTCCTGCATG GCTAAGTGCC 1080
TGGGTTTGTC CTAATAGAAC TGAACACTGG TCACTGGGTT CCATGGTTCT CTTCCATGAC 1140
CCACGGCTTC TAATAGAGCT ATAACACTCA CCGCATGGCC CAAGATTCCA TTCCTTGGTA 1200
TCTGTGAGGC CAAGAACCCC AGGTCAGAGA ANGTGAGGCT TGCCACCATT TGGGAAGTGG 1260
CCCACTGCCA TTTTGGTAGC GGCCCACCAC CATCTTGGGA GCTGTGGGAG CAAGGATCCC 1320
CCAGTAACA 1329



(2) INFORMATION FOR SEQ ID NO: 121:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 162 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Gln Asn Arg Arg Ala Leu Asp Leu Leu Thr Ala Lys Arg Gly Gly Thr
1 5 10 15

Cys Leu Phe Leu Gly Glu Glu Cys Cys Xaa Tyr Val Asn Gln Ser Gly
20 25 30

Ile Ile Thr Glu Lys Val Lys Glu Ile Xaa Asp Arg Ile Xaa Cys Arg
35 40 45

Ala Glu Asp Leu Gln Asn Thr Ala Pro Trp Gly Leu Leu Ser Gln Trp
50 55 60

Met Pro Trp Thr Leu Pro Phe Leu Gly Pro Leu Ala Ala Ile Ile Phe
65 70 75 80

Leu Leu Leu Phe Gly Pro Cys Ile Phe Asn Phe Leu Val Lys Phe Val
85 90 95

Ser Ser Arg Ile Glu Ala Val Lys Leu Gln Ile Val Leu Gln Met Glu
100 105 110

Pro Gln Met Gln Ser Met Thr Lys Ile Tyr Arg Gly Pro Leu Asp Arg
115 120 125

Pro Ala Arg Leu Cys Ser Asp Val Asn Asp Ile Glu Val Thr Pro Pro
130 135 140

Glu Glu Ile Ser Thr Ala Gln Pro Leu Leu His Ser Asn Ser Val Gly
145 150 155 160

Ser Ser

(2) INFORMATION FOR SEQ ID NO: 122:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
GGCATTGATA GCACCCATCA G 21

(2) INFORMATION FOR SEQ ID NO: 123:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
CATGTCACCA GGGTGGAATA G 21

(2) INFORMATION FOR SEQ ID NO: 124:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 758 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

GGCATTGATA GCACCCATCA GATGGCCAAA TCATTATTTA CTGGACCAGG CCTTTTCAAA 60
ACTATCAAGC AGATAGGGCC CGTGAAGCAT GCCAAAGAAA TAATCCCCTG CCTTATCGCC 120
ATGTTCCTTC AGGAGAACAA AGAACAGGCC ATTACCCAGG GGAAGACTGG CAACTAGATT 180
TTACCCACAT GGCCAAATGT CAGGGATTTC AGCATCTACT AGTCTGGGCA GATACTTTCA 240
CTGGTTGGGT GGAGTCTTCT CCTTGTAGGA CAGAAAAGAC CCAAGAGGTA ATAAAGGCAC 300
TAATGAAATA ATTCCCAGAT TTGGACTTCC CCCAGGATTA CAGGGTGACA ATGGCCCCGC 360
TTTCAAGGCT GCAGTAACCC AGGGAGTATC CCAGGTGTTA GGCATACAAT ATCACTTACA 420
CTGTGCCTGG AGGCCACAAT CCTCCAGAAA AGTCAAGAAA ATGAATGAAA CACTCAAAGA 480
TCTAAAAAAG CTAACCCAAG AAACCCACAT TGCATGACCT GTTCTGTTGC CTATAACCTT 540
ACTAAGAATC CATAACTATC CCCCAAAAAG CAGGACTTAG CCCATACGAG ATGCTATATG 600
GATGGCCTTT CCTAACCAAT GACCTTGTGC TTGACTGAGA AATGGCCAAC TTAGTTGCAG 660
ACATCACCTC CTTAGCCAAA TATCAACAAG TTCTTAAAAC ATCACAGGGA ACCTGTCCCC 720
GAGAGGAGGG AAAGGAACTA TTCCACCCTG GTGACATG 758

(2) INFORMATION FOR SEQ ID NO: 126:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:
CGGACATCCA AAGTGATGGG AAACG 25



(2) INFORMATION FOR SEQ ID NO: 127:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
GGACAGGAAA GTAAGACTGA GAAGGC 26

(2) INFORMATION FOR SEQ ID NO: 128:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:
CCTAGAACGT ATTCTGGAGA ATTGGG 26

(2) INFORMATION FOR SEQ ID NO: 129:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
TGGCTCTCAA TGGTCAAACA TACCCG 26

(2) INFORMATION FOR SEQ ID NO: 130:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1511 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

CCTAGAACGT ATTCTGGAGA ATTGGGACCA ATGTGACACT CAGACGCTAA GAAAGAAACG 60
ATTTATATTC TTCTGCAGTA CCGCCTGGCC ACAATATCCT CTTCAAGGGA GAGAAACCTG 120
GCTTCCTGAG GGAAGTATAA ATTATAACAT CATCTTACAG CTAGACCTCT TCTGTAGAAA 180
GGAGGGCAAA TGGAGTGAAG TGCCATATGT GCAAACTTTC TTTTCATTAA GAGACAACTC 240
ACAATTATGT AAAAAGTGTG GTTTATGCCC TACAGGAAGC CCTCAGAGTC CACCTCCCTA 300
CCCCAGCGTC CCCTCCCCGA CTCCTTCCTC AACTAATAAG GACCCCCCTT TAACCCAAAC 360
GGTCCAAAAG GAGATAGACA AAGGGGTAAA CAATGAACCA AAGAGTGCCA ATATTCCCCG 420
ATTATGCCCC CTCCAAGCAG TGAGAGGAGG AGAATTCGGC CCAGCCAGAG TGCCTGTACC 480
TTTTTCTCTC TCAGACTTAA AGCAAATTAA AATAGACCTA GGTAAATTCT CAGATAACCC 540
TGACGGCTAT ATTGATGTTT TACAAGGGTT AGGACAATCC TTTGATCTGA CATGGAGAGA 600
TATAATGTTA CTACTAAATC AGACACTAAC CCCAAATGAG AGAAGTGCCG CTGTAACTGC 660
AGCCCGAGAG TTTGGCGATC TTTGGTATCT CAGTCAGGCC AACAATAGGA TGACAACAGA 720
GGAAAGAACA ACTCCCACAG GCCAGCAGGC AGTTCCCAGT GTAGACCCTC ATTGGGACAC 780
AGAATCAGAA CATGGAGATT GGTGCCACAA ACATTTGCTA ACTTGCGTGC TAGAAGGACT 840
GAGGAAAACT AGGAAGAAGC CTATGAATTA CTCAATGATG TCCACTATAA CACAGGGAAA 900
GGAAGAAAAT CTTACTGCTT TTCTGGACAG ACTAAGGGAG GCATTGAGGA AGCATACCTC 960
CCTGTCACCT GACTCTATTG AAGGCCAACT AATCTTAAAG GATAAGTTTA TCACTCAGTC 1020
AGCTGCAGAC ATTAGAAAAA ACTTCAAAAG TCTGCCTTAG GCCCGGAGCA GAACTTAGAA 1080
ACCCTATTTA ACTTGGCATC CTCAGTTTTT TATAATAGAG ATCAGGAGGA GCAGGCGAAA 1140
CGGGACAAAC GGGATAAAAA AAAAAGGGGG GGTCCACTAC TTTAGTCATG GCCCTCAGGC 1200
AAGCAGACTT TGGAGGCTCT GCAAAAGGGA AAAGCTGGGC AAATCAAATG CCTAATAGGG 1260
CTGGCTTCCA GTGCGGTCTA CAAGGACACT TTAAAAAAGA TTATCCAAGT AGAAATAAGC 1320
CGCCCCCTTG TCCATGCCCC TTACGTCAAG GGAATCACTG GAAGGCCCAC TGCCCCAGGG 1380
GATGAAGATA CTCTGAGTCA GAAGCCATTA ACCAGATGAT CCAGCAGCAG GACTGAGGGT 1440
GCCCGGGGCG AGCGCCAGCC CATGCCATCA CCCTCACAGA GCCCCGGGTA TGTTTGACCA 1500
TTGAGAGCCA A 1511



(2) INFORMATION FOR SEQ ID NO: 131:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 352 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

Leu Glu Arg Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr Leu
1 5 10 15

Arg Lys Lys Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln Tyr
20 25 30

Pro Leu Gln Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn Tyr
35 40 45

Asn Ile Ile Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys Trp
50 55 60

Ser Glu Val Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn Ser
65 70 75 80

Gln Leu Cys Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln Ser
85 90 95

Pro Pro Pro Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr Asn
100 105 110

Lys Asp Pro Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys Gly
115 120 125

Val Asn Asn Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro Leu
130 135 140

Gln Ala Val Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val Pro
145 150 155 160

Phe Ser Leu Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys Phe
165 170 175

Ser Asp Asn Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly Gln
180 185 190

Ser Phe Asp Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln Thr
195 200 205

Leu Thr Pro Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu Phe
210 215 220

Gly Asp Leu Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr Glu
225 230 235 240

Glu Arg Thr Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp Pro
245 250 255

His Trp Asp Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His Leu
260 265 270

Leu Thr Cys Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro Met
275 280 285

Asn Tyr Ser Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn Leu
290 295 300

Thr Ala Phe Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr Ser
305 310 315 320

Leu Ser Pro Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys Phe
325 330 335

Ile Thr Gln Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu Pro
340 345 350


(2) INFORMATION FOR SEQ ID NO: 132:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
TGCTGGAATT CGGGATCCTA GAACGTATTC 30

(2) INFORMATION FOR SEQ ID NO: 133:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:
AGTTCTGCTC CGAAGCTTAG GCAGACTTTT 30



(2) INFORMATION FOR SEQ ID NO: 135:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 398 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

Arg Gly Ser His Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly Arg
20 25 30

Ile Leu Glu Arg Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr
35 40 45

Leu Arg Lys Lys Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln
50 55 60

Tyr Pro Leu Gln Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn
65 70 75 80

Tyr Asn Ile Ile Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys
85 90 95

Trp Ser Glu Val Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn
100 105 110

Ser Gln Leu Cys Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln
115 120 125

Ser Pro Pro Pro Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr
130 135 140

Asn Lys Asp Pro Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys
145 150 155 160

Gly Val Asn Asn Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro
165 170 175

Leu Gln Ala Val Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val
180 185 190

Pro Phe Ser Leu Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys
195 200 205

Phe Ser Asp Asn Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly
210 215 220

Gln Ser Phe Asp Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln
225 230 235 240

Thr Leu Thr Pro Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu
245 250 255

Phe Gly Asp Leu Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr
260 265 270

Glu Glu Arg Thr Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp
275 280 285

Pro His Trp Asp Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His
290 295 300

Leu Leu Thr Cys Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro
305 310 315 320

Met Asn Tyr Ser Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn
325 330 335

Leu Thr Ala Phe Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr
340 345 350

Ser Leu Ser Pro Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys
355 360 365

Phe Ile Thr Gln Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu
370 375 380

Pro Lys Leu Ala Ala Ala Leu Glu His His His His His His
385 390 395

(2) INFORMATION FOR SEQ ID NO: 137:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 378 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly Arg Ile Leu Glu Arg
1 5 10 15

Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr Leu Arg Lys Lys
20 25 30

Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln Tyr Pro Leu Gln
35 40 45

Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn Tyr Asn Ile Ile
50 55 60

Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys Trp Ser Glu Val
65 70 75 80

Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn Ser Gln Leu Cys
85 90 95

Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln Ser Pro Pro Pro
100 105 110

Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr Asn Lys Asp Pro
115 120 125

Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys Gly Val Asn Asn
130 135 140

Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro Leu Gln Ala Val
145 150 155 160

Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val Pro Phe Ser Leu
165 170 175

Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys Phe Ser Asp Asn
180 185 190

Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly Gln Ser Phe Asp
195 200 205

Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln Thr Leu Thr Pro
210 215 220

Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu Phe Gly Asp Leu
225 230 235 240

Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr Glu Glu Arg Thr
245 250 255

Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp Pro His Trp Asp
260 265 270

Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His Leu Leu Thr Cys
275 280 285

Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro Met Asn Tyr Ser
290 295 300

Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn Leu Thr Ala Phe
305 310 315 320

Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr Ser Leu Ser Pro
325 330 335

Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys Phe Ile Thr Gln
340 345 350

Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu Pro Lys Leu Ala
355 360 365

Ala Ala Leu Glu His His His His His His
370 375

(2) INFORMATION FOR SEQ ID NO: 138:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
CTTGGAGGGT GCATAACCAG GGAAT 25



(2) INFORMATION FOR SEQ ID NO: 139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:
TGTCCGCTGT GCTCCTGATC 20

(2) INFORMATION FOR SEQ ID NO: 140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:
CTATGTCCTT TTGGACTGTT TGGGT 25



(2) INFORMATION FOR SEQ ID NO: 141:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 764 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:



TGTCCGCTGT GCTCCTGATC CAGCACAGGC GCCCATTGCC TCTCCCAATT GGGCTAAAGG 60
CTTGCCATTG TTCCTGCACA GCTAAGTGCC TGGGTTCATC CTAATCGAGC TGAACACTAG 120
TCACTGGGTT CCACGGTTCT CTTCCATGAC CCATGGCTTC TAATAGAGCT ATAACACTCA 180
CTGCATGGTC CAAGATTCCA TTCCTTGGAA TCCGTGAGAC CAAGAACCCC AGGTCAGAGA 240
ACACAAGGCT TGCCACCATG TTGGAAGCAG CCCACCACCA TTTTGGAAGC AGCCCGCCAC 300
TATCTTGGGA GCTCTGGGAG CAAGGACCCC AGGTAACAAT TTGGTGACCA CGAAGGGACC 360
TGAATCCGCA ACCATGAAGG GATCTCCAAA GCAATTGGAA ATGTTCCTCC CAAGGCAAAA 420
ATGCCCCTAA GATGTATTCT GGAGAATTGG GACCAATTTG ACCCTCAGAC AGTAAGAAAA 480
AAATGACTTA TATTCTTCTG CAGTACCGCC CTGGCCACGA TATCCTCTTC AAGGGGGAGA 540
AACCTGGCCT CCTGAGGGAA GTATAAATTA TAACACCATC TTACAGCTAG ACCTGTTTTG 600
TAGAAAAGGA GGCAAATGGA GTGAAGTGCC ATATTTACAA ACTTTCTTTT CATTAAAAGA 660
CAACTCGCAA TTATGTTAAC AGTGTGATTT GTGTTCCTAC ACGGAAGCCC TCAGATTCTA 720
CTCCCCACCC CCGGCATCTC CCCTGAATCC CTCCCCAACT TATT 764

(2) INFORMATION FOR SEQ ID NO: 142:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 800 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:



TGTCCGCTGT GCTCCTGATC CAGCACAGGC GCCCATTGCC TCTCCCAATT GGGCTAAAGG 60
CTTGCCATTG TTCCTGCACA GCTAAGTGCC TGGGTTCATC CTAATCGAGC TGAACACTAG 120
TCACTGGGTT CCACGGTTCT CTTCCATGAC CCATGGCTTC TAATAGAGCT ATAACACTCA 180
CTGCATGGTC CAAGATTCCA TTCCTTGGAA TCCGTGAGAC CAAGAACCCC AGGTCAGAGA 240
ACACAAGGCT TGCCACCATG TTGGAAGCAG CCCACCACCA TTTTGGAAGC GGCCCGCCAC 300
TATCTTGGGA GCTCTGGGAG CAAGGACCCC CAGGTAACAA TTTGGTGACC ACGAAGGGAC 360
CTGAATCCGC AACCATGAAG GGATCTCCAA AGCAATTGGA AATGTTCCTC CCAAGGCAAA 420
AATGCCCCTA AGATGTATTC TGGAGAATTG GGACCAATCT GACCCTCAGA CAGTAAGAAA 480
AAAAATGACT TATATTCTTC TGCAGTACCG CCTGGCCACG GATATCCTCT TCAAGGGGGA 540
GAAACCTGGC CTCCTGAGGG AAGTATAAAT TATAACACCA TCTTACAGCT AGACCTGTTT 600
TGTAGAAAAG GAGGCAAATG GAGTGAAGTG CCATATTTAC AAACTTTCTT TTCATTAAAA 660
GACAACTCGC AATTATGTAA ACAGTGTGAT TTGTGTCCTA CAGGAAGCCC TCAGATCTAC 720
CTCCCTACCC CGGCATCTCC CTGACTCCTT CCCCAACTAA TAAGGACCCA CTTCAGCCCA 780
AACAGTCCAA AAGGACATAG 800



(2) INFORMATION FOR SEQ ID NO: 169:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:
consensus (41/68-1 + 42/68-1 + cl43 68-1)

(2) INFORMATION FOR SEQ ID NO: 170:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:
GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT 60
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAGATAG CCAGACCATT AAATACACGA 360
ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG 420
GCTTTCCAGG CCCTAAAG 438

(2) INFORMATION FOR SEQ ID NO: 171:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:
GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT 60
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAGTAG CCAGACCATT AAATACACGA 360
ATTAAGGAAA CTCAAAAAGC CAGTACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG 420
GCTTTCCAGG CCCTAAAG 438



(2) INFORMATION FOR SEQ ID NO: 172:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:
GACTTGAGCC AGTCYTCATA CCTGGACAYT CTTGTCCTTC GGTACATGGA TGATTTACTT 60
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAATAG CCAGACCATT AAATACACGA 360
ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACATCTGA AGCAGAAGTG 420
GCTTTCCAGG CCCTAAAG 438

(2) INFORMATION FOR SEQ ID NO: 173:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:
DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTPEAEV AFQALK 146
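
The nucleotide entry SEQ ID NO: 170 and the peptide entry SEQ ID NO: 173 can be cross-checked by translating the former in its first reading frame. The following is a minimal Python sketch of such a check. The function name `translate`, the use of the standard genetic code and the choice of frame 1 are assumptions made for illustration and are not stated in the listing.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order.
# Assumption: the listing does not name a translation table.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str) -> str:
    """Translate a DNA string in frame 1; unrecognised codons become 'X'."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))


# SEQ ID NO: 170 as given in the listing (438 nucleotides).
SEQ_170 = (
    "GACTTGAGCCAGTCCTCATACCTGGACACTCTTGTCCTTCGGTACATGGATGATTTACTT"
    "TTAGCCACCCATTCAGAAACCTTGTGCCATCAAGCCACCCAAGCACTCTTAAATTTCCTT"
    "GCTACCTGTGGCTACAAGGTTTCCAAACCAAAGGCTCAGCTCTGCTCACAGCAGGTTAAA"
    "TACTTAGGGCTAAAATTATCCAAAGGCACCAGAACCCTCAGTGAGGAACGTATCCAGCCT"
    "ATACTGGGTTATCCTCATCCCAAAACCCTAAAGCAACTAACAGCGTTCCTTGGCATAACA"
    "GGTTTCTGCCAAATATGGATTCCCAGGTACAGCAAGATAGCCAGACCATTAAATACACGA"
    "ATTAAGGAAACTCAAAAAGCCAATACCCATTTAGTAAGATGGACACCTGAAGCAGAAGTG"
    "GCTTTCCAGGCCCTAAAG"
)

# SEQ ID NO: 173 as given in the listing (146 amino acids).
SEQ_173 = (
    "DLSQSSYLDTLVLRYMDDLLLATHSETLCHQATQALLNFLATCGYKVSKP"
    "KAQLCSQQVKYLGLKLSKGTRTLSEERIQPILGYPHPKTLKQLTAFLGIT"
    "GFCQIWIPRYSKIARPLNTRIKETQKANTHLVRWTPEAEVAFQALK"
)

peptide = translate(SEQ_170)
print(len(peptide), peptide == SEQ_173)
```

The same function can be applied to SEQ ID NO: 171 and SEQ ID NO: 172 for comparison with SEQ ID NO: 174 and SEQ ID NO: 175. Codons containing an ambiguity code such as Y are reported as X by this sketch.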

(2) INFORMATION FOR SEQ ID NO: 174:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:
DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKVARPLNTR IKETQKASTH LVRWTPEAEV AFQALK 146

(2) INFORMATION FOR SEQ ID NO: 175:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:
DLSQSSYLDX LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTSEAEV AFQALK 146

(2) INFORMATION FOR SEQ ID NO: 176:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:
consensus (1/46-7 + 8/46-7 + C15/46/7)
(2) INFORMATION FOR SEQ ID NO: 177:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:
GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCTTC AGTATGGGGA TGACTTAATT 60
ATAGCCACCC ATTCAGAAAC CTTGTGGCAT CAAGCCACCC AAGCGCTCTT AAATTTCCTT 120
GCTACCTGTG GCTCCAAACA AAAGGCTCAC CTCTGCTCAC ACCAGGTTAA ATACTTAGGG 180
CTAAAATTAT CCAAAGTCAC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT 240
TATCCTCATC CCATAACCCT AAAGCAACTA AGAGGGTTCC TTGGCATATC AGCCTTCTGC 300
CGAATATGGA TTCCCGGATA CAGTGAAATA GCCAGGCCAT TATGTACATT AATTAAGGAA 360
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420
GCCCTAAAG 429

(2) INFORMATION FOR SEQ ID NO: 178:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:
GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCTTC AGTATAGGGA TGATTTAATT 60
ATAGCCACCC ATTCAGAAAC CTTGTGGCAT CAAGCCACCC AAGTGCTCTT AAATTTCCTC 120
GCTACCTGTG GCTCCAAACA AAGGGCTCAG CTCTGCTCAC AGCAGGTTAA ATACTTAGGG 180
CTAAAATTAT CCAAAGTCGC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGAT 240
TATCCTCATC CCAAAACCAT AAAGCAACTA AGAGGGTTCC TTGGCATAAC AGCCTTCTGC 300
CGAATATGGA TTCCCCGATA CAGTGAAATA GCCAGGCCAT TATGTACATT AGTTAAGGAA 360
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AGACAGAAGT GGCTTTCCAG 420
GCCCTAAAG 429



WO 98/23755 PCT/IB97/01482

(2) INFORMATION FOR SEQ ID NO: 179:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:
GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCCTC AGTATGGGGA TGATTTAATT 60
ATAGCCACCC ATTCAGAAAC CTTGTGGCAC CAAGCCACCC AAGCGCTCTT AAATTTCCTC 120
GCTACCTGTG GCTCCAAACA AAAGGCTCAG CTCTGCTCAC AGCAGGTTAA ATACTTAGGG 180
CTAAAATTAT CCAAAGTCAC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT 240
TATCCCCATC CCAAAACCCT AAAGCAACTA AGARGGTTCC TTGGCATAAC AGCCTTCTGC 300
CGAATATGGA TTCCCAGATA CAGCGAAATA GCCAGGCCAT TATGTACATT ATCTAAGGAA 360
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420
GCCCTAAAG 429

(2) INFORMATION FOR SEQ ID NO: 180:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:
DLSQSSYLDI LVLQYGDDLI IATHSETLWH QATQALLNFL ATCGSKQKAH 50
LCSHQVKYLG LKLSKVTRAL REERIQRILA YPHPITLKQL RGFLGISAFC 100
RIWIPGYSEI ARPLCTLIKE TQKANTHIVR WTPETEVAFQ ALK 143

(2) INFORMATION FOR SEQ ID NO: 181:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:
DLSQSSYLDI LVLQYRDDLI IATHSETLWH QATQVLLNFL ATCGSKQRAQ 50
LCSQQVKYLG LKLSKVARAL REERIQRILD YPHPKTIKQL RGFLGITAFC 100
RIWIPRYSEI ARPLCTLVKE TQKANTHIVR WTPETEVAFQ ALK 143

(2) INFORMATION FOR SEQ ID NO: 182:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:
DLSQSSYLDI LVPQYGDDLI IATHSETLWH QATQALLNFL ATCGSKQKAQ 50
LCSQQVKYLG LKLSKVTRAL REERIQRILA YPHPKTLKQL RXFLGITAFC 100
RIWIPRYSEI ARPLCTLSKE TQKANTHIVR WTPETEVAFQ ALK 143

(2) INFORMATION FOR SEQ ID NO: 183:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:
GGCCAGGCAT CAGCCCAAGA CTTGA 25

(2) INFORMATION FOR SEQ ID NO: 184:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:
TGCAAGCTCA TCCCTSRGAC CT 22
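
SEQ ID NO: 184 contains the letters S and R, which in the IUPAC nucleotide nomenclature stand for C or G and for A or G respectively, so the entry describes a small set of oligonucleotides rather than a single one. The sketch below enumerates that set. The function name `expand_degenerate` is hypothetical, and reading S and R as IUPAC ambiguity codes is an assumption, since this section does not state the convention used.

```python
from itertools import product
from typing import List

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> List[str]:
    """Return every explicit sequence covered by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# SEQ ID NO: 184 as given in the listing, written without spaces.
SEQ_184 = "TGCAAGCTCATCCCTSRGACCT"

for variant in expand_degenerate(SEQ_184):
    print(variant)
```

With two twofold-degenerate positions the expansion yields four explicit 22-mers.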
(2) INFORMATION FOR SEQ ID NO: 185:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:
GACTTGAGCC AGTCCTCATA CCT 23



(2) INFORMATION FOR SEQ ID NO: 186:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:
CTTTAGGGCC TGGAAAGCCA CT 22
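
Oligonucleotides in the listing are written 5' to 3', so a short entry that lies on the opposite strand has to be reverse-complemented before it can be located on a longer entry. The sketch below applies this to SEQ ID NO: 185 and SEQ ID NO: 186 against SEQ ID NO: 170. Treating SEQ ID NO: 186 as an antisense sequence is an inference drawn from the sequences themselves, not a statement made in this section, and the helper name `reverse_complement` is hypothetical.

```python
# Complement table for unambiguous DNA bases only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# SEQ ID NO: 170 as given in the listing (438 nucleotides).
SEQ_170 = (
    "GACTTGAGCCAGTCCTCATACCTGGACACTCTTGTCCTTCGGTACATGGATGATTTACTT"
    "TTAGCCACCCATTCAGAAACCTTGTGCCATCAAGCCACCCAAGCACTCTTAAATTTCCTT"
    "GCTACCTGTGGCTACAAGGTTTCCAAACCAAAGGCTCAGCTCTGCTCACAGCAGGTTAAA"
    "TACTTAGGGCTAAAATTATCCAAAGGCACCAGAACCCTCAGTGAGGAACGTATCCAGCCT"
    "ATACTGGGTTATCCTCATCCCAAAACCCTAAAGCAACTAACAGCGTTCCTTGGCATAACA"
    "GGTTTCTGCCAAATATGGATTCCCAGGTACAGCAAGATAGCCAGACCATTAAATACACGA"
    "ATTAAGGAAACTCAAAAAGCCAATACCCATTTAGTAAGATGGACACCTGAAGCAGAAGTG"
    "GCTTTCCAGGCCCTAAAG"
)
SEQ_185 = "GACTTGAGCCAGTCCTCATACCT"  # 23 nucleotides
SEQ_186 = "CTTTAGGGCCTGGAAAGCCACT"   # 22 nucleotides

# Position of each oligonucleotide on SEQ ID NO: 170 (0-based, -1 if absent).
sense_start = SEQ_170.find(SEQ_185)
antisense_start = SEQ_170.find(reverse_complement(SEQ_186))
print(sense_start, antisense_start)
```

In this listing SEQ ID NO: 185 coincides with the first 23 nucleotides of SEQ ID NO: 170 and the reverse complement of SEQ ID NO: 186 with its last 22, so the two entries bracket the 438-nucleotide sequence.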
CLAIMS

1. Nucleic material, in the isolated or
purified state, comprising a nucleotide sequence selected
from the group including sequences SEQ ID NO: 93, SEQ ID
NO: 94, their complementary sequences and their equivalent
sequences, in particular nucleotide sequences displaying,
for any succession of 100 contiguous monomers, at least
50% and preferably at least 60% homology with said
sequences SEQ ID NO: 93, SEQ ID NO: 94 and their
complementary sequences, excluding the HSERV-9 sequence.

2. Nucleic material of claim 1, the nucleotide
sequence of which is selected from the group including
sequences SEQ ID NO: 93, SEQ ID NO: 94, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 70% and preferably at least
80% homology with said sequences SEQ ID NO: 93, SEQ ID NO: 94
and their complementary sequences.

3. Nucleic material, in the isolated or
purified state, coding for any polypeptide displaying, for
any contiguous succession of at least 30 amino acids, at
least 50%, preferably at least 60%, and most preferably
at least 70% homology with a peptide sequence encoded by
any nucleotide sequence selected from the group including
SEQ ID NO: 93, SEQ ID NO: 94 and their complementary
sequences.

4. Nucleic material, in the isolated or
purified state, of retroviral type, comprising a
nucleotide sequence identical or equivalent to at least
part of the pol gene of an isolated retrovirus associated
with multiple sclerosis or rheumatoid arthritis.

5. Nucleic material as claimed in claim 4,
said nucleotide sequence being 80% homologous to said at
least part of the pol gene.






6. Nucleic material comprising a nucleotide
sequence identical or equivalent to at least part of the
pol gene of an isolated virus encoding a reverse
transcriptase comprising an enzymatic site comprised
between the amino acid domains LPQG and YXDD, said virus
having a phylogenic distance with HSERV-9 of 0.063 ± 0.1,
and preferably 0.063 ± 0.05.

7. Nucleotide fragment comprising a nucleotide
sequence selected from the group including SEQ ID NO: 93,
SEQ ID NO: 94, their complementary sequences and their
equivalent sequences, in particular nucleotide sequences
displaying, for any succession of 100 contiguous monomers,
at least 50% and preferably at least 60% homology with
said sequences and their complementary sequences, said
group excluding SEQ ID NO: 1, and said nucleotide fragment
neither comprising nor consisting of the sequence HSERV-9.

8. Nucleotide fragment of claim 7, the nucleotide
sequence of which is selected from the group including SEQ
ID NO: 93, SEQ ID NO: 94, their complementary sequences and
their equivalent sequences, in particular nucleotide
sequences displaying, for any succession of 100 contiguous
monomers, at least 70% and preferably at least 80%
homology with said sequences and their complementary
sequences.

9. Nucleotide fragment comprising a coding
nucleotide sequence which is at least partially identical
to a nucleotide sequence selected from the group
including:

SEQ ID NO: 93, SEQ ID NO: 94; their complementary
sequences; their equivalent sequences, in particular
homologous to SEQ ID NO: 93, SEQ ID NO: 94;

sequences encoding at least part of the peptide
sequence defined by SEQ ID NO: 95;

sequences encoding at least part of a peptide
sequence equivalent, in particular homologous, to SEQ ID
NO: 95, which is capable of being recognized by sera of
patients infected with the MSRV-1 virus, or in whom the
MSRV-1 virus has been reactivated.

10. Nucleic acid probe for the detection of a
virus associated with multiple sclerosis or rheumatoid
arthritis, characterized in that it is capable of
hybridizing specifically with any fragment according to
any one of claims 7 to 9.

11. Probe as claimed in claim 10, consisting of
between 10 and 1,000 monomers.

12. Primer for the amplification by
polymerization of an RNA or a DNA of a viral material
associated with multiple sclerosis or rheumatoid
arthritis, comprising a nucleotide sequence identical or
equivalent to at least one portion of the nucleotide
sequence of a fragment as claimed in any one of claims 7
to 9, in particular a nucleotide sequence displaying, for
any succession of at least 10 contiguous monomers,
preferably 15 contiguous monomers, more preferably 18
contiguous monomers and even most preferably 20 contiguous
monomers, at least 70% homology with at least the said
portion of the said fragment.

13. Primer as claimed in claim 12, comprising a
sequence selected from the group consisting of SEQ ID NO:
99 to SEQ ID NO: 111.

14. Polypeptide encoded by any open reading
frame belonging to a nucleotide fragment as claimed in any
one of claims 7 to 9.

15. Polypeptide of claim 14, characterized in
that the open reading frame encoding it is comprised, in
the 5'-3' direction, between nucleotide 18 and nucleotide
2304 of SEQ ID NO: 93.

16. Polypeptide according to claim 15,
comprising a peptide sequence at least partially identical
to SEQ ID NO: 95.

17. Polypeptide, comprising a peptide sequence
at least partially identical to SEQ ID NO: 96.
18. Polypeptide of claim 17 exhibiting an
enzymatic activity consisting of proteolytic activity.

19. Polypeptide, characterized in that the open
reading frame encoding it begins, in the 5'-3' direction,
at nucleotide 18 and ends at nucleotide 340 of SEQ ID
NO: 93.

20. Polypeptide exhibiting an inhibitory
activity on the proteolytic activity of the polypeptide of
claim 18.

21. Polypeptide, comprising a peptide sequence
identical or equivalent to SEQ ID NO: 97.

22. Polypeptide of claim 21, comprising a
peptide sequence identical or equivalent to SEQ ID NO: 98.

23. Polypeptide, characterized in that the open
reading frame encoding it begins, in the 5'-3' direction,
at nucleotide 341 and ends at nucleotide 2304 of SEQ ID
NO: 93.

24. Polypeptide, characterized in that the open
reading frame encoding it begins, in the 5'-3' direction,
at nucleotide 1858 and ends at nucleotide 2304 of SEQ ID
NO: 93.

25. Polypeptide of claim 21 or 23, exhibiting a
reverse transcriptase activity.

26. Polypeptide of claim 22 or 24, exhibiting a
ribonuclease H activity.

27. Polypeptide exhibiting an inhibitory
activity on the reverse transcriptase activity of the
polypeptide of claim 25.

28. Polypeptide having an inhibitory activity
on the ribonuclease H activity of the polypeptide of claim 26.

29. Antigenic polypeptide recognized by the
sera of patients infected with the MSRV-1 virus, and/or in
whom the MSRV-1 virus has been reactivated, characterized
in that its peptide sequence is at least partially
identical or is equivalent to a sequence selected from the
group consisting of SEQ ID NO: 95, and fragments thereof,
in particular SEQ ID NO: 96, SEQ ID NO: 97 and SEQ ID NO:
98.

30. Mono- or polyclonal antibody directed
against the MSRV-1 virus, characterized in that it is
obtained by the immunological reaction of a human or
animal body or cells to an immunogenic agent consisting of
an antigenic polypeptide of claim 29.

31. Reagent for detection of the MSRV-1
virus, or of an exposure to the said virus, characterized
in that it comprises at least one reactive substance
selected from the group consisting of a probe as claimed
in claim 10 or 11; a polypeptide as claimed in any one of
claims 14 to 29; or an antibody as claimed in claim 30.

32. Diagnostic, prophylactic or therapeutic
composition, in particular for inhibiting the expression
of a virus associated with multiple sclerosis or
rheumatoid arthritis, and/or the enzymatic activity of the
proteins of said virus, said composition comprising a
nucleotide fragment of any one of claims 7 to 9.

33. Diagnostic, prophylactic or therapeutic
composition comprising a polypeptide of any one of claims
14 to 29, or an antibody of claim 30.

34. Process for detecting a virus associated
with multiple sclerosis or rheumatoid arthritis, in a
biological sample, characterized in that an RNA and/or a
DNA presumed to belong to or to originate from said virus, or
their complementary RNA and/or DNA, is/are brought into
contact with a nucleotide fragment according to any one of
claims 7 to 9.

35. Process for detecting the presence of or
exposure to a virus associated with multiple sclerosis or
rheumatoid arthritis, in a biological sample, wherein said
sample is brought into contact with a polypeptide
according to any one of claims 14 to 29, or an antibody of
claim 30.
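
Claims 1, 2, 7 and 8 set a homology threshold that must hold "for any succession of 100 contiguous monomers", which means the criterion is governed by the worst window rather than by the average over the whole sequence. The sketch below illustrates that reading. It assumes that homology is measured as percent identity between two sequences that are already aligned without gaps and have equal length. That definition is an assumption made here and is not given in the claims, and the function name `min_window_identity` is hypothetical.

```python
def min_window_identity(ref: str, cand: str, window: int = 100) -> float:
    """Lowest fraction of identical positions over all windows of the given size.

    Both strings must be pre-aligned, gap-free and of equal length
    (an illustrative simplification).
    """
    if len(ref) != len(cand):
        raise ValueError("sequences must have the same length")
    if window <= 0 or window > len(ref):
        raise ValueError("window must be between 1 and the sequence length")

    matches = [a == b for a, b in zip(ref.upper(), cand.upper())]
    current = sum(matches[:window])  # identities in the first window
    lowest = current
    for i in range(window, len(matches)):
        # Slide the window one position to the right.
        current += matches[i] - matches[i - window]
        lowest = min(lowest, current)
    return lowest / window


# Toy example with a window of 4: only the last window contains a mismatch.
worst = min_window_identity("ACGTACGT", "ACGTACGA", window=4)
print(worst)
```

Under this reading, a candidate would satisfy the 50% criterion of claim 1 when `min_window_identity(reference, candidate) >= 0.5`, and the 70% criterion of claim 2 when the value is at least 0.7.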
GTO33^T AhCCClCATC TCTTTQGTCA. <3GT7^CTCGCC CAAGATCT*G 50
GQCACTICTC /OCTICCAG^ AC1CIGTYOC TICAG 8S 

SEQ ID NO 3 (POL MSRV-1B)

CTTCT^QQGAT JV3QQOCX2VXC TOTTTC330CA GQCTCTftQCT CAA37\CTTC& SO 
GOCT^TTCIC AJJOCTCQSaC AITICIVCTOC TICQST 86 

SEQ ID NO 4 (POL MSRV-1B)

CT*ICAP^GKr TOCOXCA!lC TMTITGQOCW OU^G?^CTTC& S3 

cvcwmeic iydcriQicc topc es 

SEQ ID NO 5 (POL MSRV-1B)

M»a?T33aC JCTCnGTOC 65 

SEQ ID NO 6 (POL MSRV-1B)



Consensus 
Consensus 

Consensus 



GTGTXQOCftC BG£3MO£CT CMX7IMITIG G¥CWRGttMfT 

HPCYCRAKAY YTRBGVCAVr TCTYAKKXSY FGSNAYTCTS KYOCTIYRGT 

S^J //J < POL MSRV-IB) 
FIG. 2

CONSENSUS A SEQ ID NO 3

C^TAGGGATAGCCC TCATCTCTTTGGTCA CCTACTCCCC^ ^OGCCAC^ 

VrVs'p S h\ L F W C S Q V t A Q - 0 L Vo 
\ C I A L ISLV.R YWPKI A T S Q 



AGGTCCAGGCACTCT GTTCCTTCAG 
RSRHS VP. S 
GPGTL F L Q 
V Q A L C S F 



8S 



VrVs'p Vl^Vq 'aVaV* \\\\\ 

S G I A P I Y L A R H - L N T - A S S H 

86 

ATACCTGGACACTCT TGTCCTTCGGT 
I P G H S CPS 
Y L O T L V L R 
T W T L L S F G 

CONSENSUS C SEQ ID NO 5

cctct^ca coc.mocccMC otc^ttctc eo 

v n <: . P P S I W P Q I S P R_ L fc * J- 



SGIAP IYLAR H-ri^> 

85 

ATACCTGGACACTCT TGTCCTTCAG 
IPGHS CPS 
YLOTL VLQ 
T W T L L S F 

CONSENSUS D seq id no 6 ArrrArTTrrr 6G 

GTTCAGGGATAGCTC CCATCTATTTGGCCT GGCATTAACCCGAGA CTTAACCCACTTCTC 

•\V^%V^v '.%•,'.'.«,',•. 

8S 

ATACGTGGACACTCT TGTCCTTTGG 
I R G H S C P L 
YVOTL VLW 
T W T I L S P 
FIG. 3

Consensus TTOSA.TOCAG TGYIGGCACA GQOQQCTGAA GCCTMCQaG TOGftGrroGC SO 
Consensus GGKTQ33SCC *Dmv30CICT AOGTG3V2GA OCTSCTCA^G CTTCAG 96 

.S££ ID NO U 
FIG. 6



CAAGCCACCC AAGAACICTT AAATTTCCTC ACT?OCTGTG GCTACAAGGT 
TIOCAAAOCA AAGGCTCAGC TCTGCICACA GGAGATTAGA TACTTAG3GT 
TAAAATIATC CAAAGGCACC AQQG3CCICA GTCAGGAAOG TATCCAGGCT 
ATACTGQGTT ATOCICATOC CAAAACOCIA AAQCAACTAA GAQQGTTOCT 
TAGCATGA1C AGGTTICTQC CGAAAACAAG ATTOOCAQGT ACAAOCAAAA 
TAGCCAGACC ATTATATACA CTAATIAAQG AAACICAGAA AQCCAATAOC 
TATTTAGTAA GATG3ACACC TAAACAGAAG GCITIOCAG3 OXTAAAGAA 
QGOXTAADC CAAGOOXAG TGTICAGCTT G0CAACAG3G CAAGATTTTT 
CTTTATATG3 CACAGAAAAA ACAG3AATCG CTCTAGGAGT CCTIACACAG 
GIQOGAGQGA TCAGCTTQCA AO00GTOGCA TAOCTGAATA AQGAAATIGA 
TCTAGTO3CA AAQ3GTIQQC CTCAT3SGTTT AT333TTAATC Q0332AGIAG 
CAGTCTNAGT ATCTCAAQCA GTIAAAATAA TACA333AAG- AGATCTTtCT 
GTCTGGACAT C1CATGATCT GAACQQCATA CICAC1GCTA AAGGAG&CTT 
GIGGTIGTCA GACAAOCATT TACTTAANIA TCAG3CTCTA TIACTTCAAG 
AQ0CAGTG2T GNGACIGOGC ACTIGTQCAA C1LTIAAACC C 
TCAGGGATAGCCCCCATCTATTTGGCCAGGCATTAGCGCAAGACTTGAGTC

AATTCTCATACCTGGACACTCTTGTCCTTCAGTACATGGATGATTTACTTT

TAGTCGCCCGTTCAGAAACCTTGTGCCATCAAGCCACCCAAGAACTCTTAA

CTTTCCTCACTACCTGTGGCTACAAGGTTTCCAAACCAAAGGCTCGGCTCT

GCTCACAGGAGATTAGATACTNAGGGCTAAAATTATCCAAAGGCACCAGG

GCCCTCAGTGAGGAACGTATCCAGCCTATACTGGCTTATCCTCATCCCAAA

ACCCTAAAGCAACTAAGAGGGTTCCTTGGCATAACAGGTTTCTGCCGAAA

ACAGATTCCCAGGTACASCCCAATAGCCAGACCATTATATACACTAATTA

NGGAAACTCAGAAAGCCAATACCTATTTAGTAAGATGGACACCTACAGAA

GTGGCTTTCCAGGCCCTAAAGAAGGCCCTAACCCAAGCCCCAGTGTTCAGC

TTGCCAACAGGGCAAGATTTTTCTTTATATGCCACAGAAAAAACAGGAAT

AGCTCTAGGAGTCCTTACGCAGGTCTCAGGGATGAGCTTGCAACCCGTGGT

ATACCTGAGTAAGGAAATTGATGTAGTGGCAAAGGGTT



SEQ ID NO 8 (MOQ3-POC*)
TRANSLATION OF MSRV-1 POL (A)

CCC TTT GCC ACT ACA TCA ATT TTA CCiA GTA ACC AA CCC AAC CCA CAC TOG ACC TTA GTC CAA CAA CTC ACC
PFATTS I L.GVRKP*JCQWRL-VOELR>

ATT ATC AAT CAC OCT CTT CTT OCT CTA TAC CCA OCT GTA OCT AAC OCT TAT ACA GTC CTT TCC CAA ATA CCA
IIM EAVVPLYPAVFNPYTVLSQI P>

CAC CAA OCA GAG TOG TTT ACA GTC CTC CAC CTT AAG
EEAEVfFTVLDLK

CAT GCC TTT TTC TCC ATC OCT GTA OCT OCT CAC TCT
DAFFCIPVRPDS>

CAA TTC TTG TTT COC TTT CAA CAT OCT TTC AAC CCA AOS TCT CAA CTC ACC TCC ACT -CTT
QF LFA-F ED P L N PTSQLTWT | V

TTA CCC CAA CCC ]
L P Q C>

TTC AOS CAT AOC COC CAT CTA TTT GCC CAC CCA TTA COC CAM CAC TTC ACT CAA TTC TCA TAC CTC CAC ACT
FR t> S P H L F G Q A LAQIdLSO F S Y X. D T>

CTT CTC CTT CAG|TAC ATC CAT CAT fTTA CTT TTA CTC CCC OCT TCA CAA ACC TTC TCC CAT CAA COC AOC CAA
P 1 L, L L V A R S

CAA CTC TTA ACT TTC CTC ACT ACC TCT CCC TAC AAC CTT TCC AAA OCA AAC OCT COG CTC TCC TCA CAC CAC
ELLTF I-TTCGYKVSKPXAR L C S Q E>

ATT ACA TAC TKA COC CTA AAA TTA TCC AAA COC ACC ACC CCC CTC ACT CAC CAA OCT ATC CAC OCT ATA CTC
IRYX CL,Kl.SrCTRAX.SEERIQPIt>*

OCT TAT OCT CAT CCC AAA ACC CTA AAC CAA CTA ACA CCC TTC CTT CCC ATA ACA OCT TTC TCC CCA AAA CAC
AYPHPKTLKCLRCFLCITG F C R K Q>

ATT COC AOS TAC ASC CCA ATA CCC ACA OCA TTA TAT ACA CTA ATT AM3 GAA ACT CAC AAA COC AAT ACC TAT
IPR.YX P lARPtiYTLXJCETQ KAHT Y>

TTA CTA ACA TCC ACA OCT ACA CAA CTC OCT TTC CAC OQC CTA AAC AAC OCC CTA ACC CAA OOC CCA CTC TTC
LVRWT PTEVAFQALKKALTQAPVF>

ACC TTG CCA ACA COG CAA CAT TTT TCT TTA TAT OOC ACA CAA AAA ACA CCA ATA OCT CTA CCA CTC CTT ACC
SL PTCQDFSLYATEKTGI A L C V L T>

CAC CTC TCA CCC ATC ACC TTC CAA COC CTC CTA TAC CTC ACT AAC GAA ATT CAT CTA GTC CCA AAG OCT TCC
OVSGMSLQPVVYLSKEXDVVAICCW>

CAT CTT MCT CTC TCC ACA TCT CAT CAT CTC AAC CCC ATA CTC ACT OCT AAA CCA CAC TTG TCC TTC TCA CAC
DLXVWTSHOVNCILTAKGD LWLSD>
FIG. 13

SEQ ID NO 46 (FBd3) 

GTGCTGATTGGTGTATTTACAATCCTTTATCTAATCCGAAATGCCCATGTTG 

CAATATGGAAAGAAAGGGAGTTCCTAACCTCTGGGGGAACCCCCATTAAA 

TACCACAAGTAAATCATGGAGTTATTGCACACAGTGCAAAAACTCAAGGA 

GGTGGAAGTCTTACACTGCCAAAGCCATCAGAAAAGGGAAGAGGGGAGAA 

GAGCAGCATAAGTGGCTACAGAGGCAAGGAAAGACTAGCAGAAAGGAAA 

GAGAGAAAGAGACAGAAAGTCAGAGAGAGAGAGAGGAAGAGACAGAGCA 

CAAAGAGGGAGTCAGAGAGAGAGAGAGACAGAGAGTCAGAGAGAAGGAA 

AGAGAGAGAGGAAGAGACAAAGAATGAATCAAACAGAGAGACAGAAAGT 

CAGAGAGAGAGAGAGAGAGGAAGAGACAGAGAAAAAGAGGGAGTCAGAA 

AAAGAGAGACCAAAGAAGAAGTCCAAAGAGAAAGAAAGAGAGATGGAAG 

TAGTAAAGGAAAAACAGTGTACCCTATTCCTTTAAAAGCCGGGGTAAATTT 

AAAACCTATAAITGATAACTGAAGGTCnTCTCTGTAACCCTGTAACACTCC 

AATACCACCTTGTTGTCAAGTGTAAACAAGGGCGTAGCCCAAAAGCACTG 

AGGCCACTAACAACCCATAGCCTTCCTATCAAAATTCCTTAACCCAGCAGG 

TTTCCTAACAGGGGATCTAAATCTTAATTAATTACCATACAATGGTCCAAC 

GAGACTTAGGAGGAATTCCCTTCAGGACGGGAAGATAGATGCTTCCTCCCA 

GGCGATTAAGGGAGAAAGACACAATGGGTATTCAGTAAGTGCCAAGGGGA 

ACACTTGTAGAAGCAAAGTTAGGAAAATTGCCAAATAATTGGTTTGCTCAA 

GAGTTGTTTGCACTCAGCCAAACCTTGAAGTACTTGCAGAATCAGAAAGGA 

GCCATCTATACCAATTCTAAGTTAATATGGACTG AAGGA GGTTTTATTAAT 

ACCAAAGAGAAATTAAAATCCCAAACTTATAAGGTTTTCAACCAAAGTAA 

AGTITGCTAAAAGTTAACAGCGTAACATGTATTATCCTACTACCACACACT 

CTCAAAGGATTTCTCAGACAGTTTGCAAGAAATAATGATATCTATCCTTAC 

TCTACAATCCCAAATAGACTCTTTGGCAGCAGTGACTCTCCAAAACCGTCA 

AGGCCT AG ACCTCCTC ACTGCTG AG AAAGG AGG ACTCTGC ACC1I cj i 1 A AG 

GGAAGAGTGTTGTCTTTACACTAACCAGTCAGGGATAGTATGAGATGCTGC 

CCGGCATTTACAGAAAAAGGCTTCTGAAATCAGACAACGCCTTTCAAATTC 

CTATACCAACCTCTGGAGTTGGGCAACATGGTTTCTTCCCTTTCTATGTCCC 

ATGGCTGCCATCTTGCTATTACTCGCCTTTGGGCCCTGTA 1 l i i i AACCTCC 

TTGTCAAATTTGTTTCTTCTAGGATCGAGGCCATCAAGCTACAGATGGTCTT 

ACAAATGGAACCCCAAATGAGCTCAACTATCAACTTCTACTGAGGACCCCT 

AGACCAACCCCCTGGCCCTTTCACTGGCCTAAAGAGTTCCCGTCTGGAGGA 

CACTACCACTGCAGGGCCCCATCTTTGCCCCTATCCAGAAGGAAGTAGCTA 

GAGCAGTCATTGCCCAATTCCCAAGAGCAGCTGGGGTGTCCCGTTTAGAGT 

GGGGATTGAGAGGTGAAGCCAGCTGGACTTCTGGGTCGGGTGGGGACTTG 

GAGAACTTTTGTGTCTAGCTAAAGGATTGTAAATGCAACAATCAGTGCTCT 

GTGTCTAGCTAAAGGATTGTAAATACACCAATCAGCAC 



FIG. 14

FIG. 15

SEQ ID NO 51 (tpol) 



GGCTGCTAAAGGAGACTTGTGGTTGTCAGACAATCGCCTACTTAGGTACCA 

GGCCTTATTACTTGAGGGACTGGTGCTTCAGATGCGCACTTGTGCAGCTCT 

TAACCCAAAGTITATGCTGCCCAGAAGGATCTTTTAGAGGTCCCCTTAGCCA 

ACCCTGACCTCAACCTATATATATACTGATGGAAGTTCGTTTGTAGAAAAG 

GGATTACAAAGGGNAGGATATNCCATAGGTTAGTGATAAAGGAGTACTTG 

AAAGTAAGCCTCTTCCCCCCAGGGACCAGCGCCCCCGTTAGCAGAACTAGT 

GGCACTGACCCCGAGCCTTAGAACTTGGAAAGGGAGGAGGATAAATGTGT 

ATACAGATAGCAAGTATGCTTATCTAATCCGAAATGCCCATGTTG 



SEQ ID NO 52 (JLBc1)
TCAGGGATAGCCCCCATCTATTTGGTCAGGCACTGGCCCAAGATCTAGGGA 

CATGCCACTTTTAAGAGCCATTTCTCAAGTCCAGGTACTCTGGTCC'i-rCGGT 

ATGTGGATGA7TTACTTTTGGCTACCAGTTCAGTAGCCTCATGCCAGCAGG 

Cl-ACTCTAGATCTCTTGAACTTTCTAGCTAATCAAGGGTACAAGGCATCTA 

GGTTGAAGGCCCAGCTTTGCCTACAGCAGGTCAAATATCTAGGCCTAATCT 

TAGCCAGAGGGACCAGGGCACTCAGCAAGGAACAAATACAGCCTATACTG 

GCTTATC r T"CACCCTAAGACATTAAAACAGTTGCGGGGGTTCCTTGGAATC 

ACTGGCTTTTTGGTGACTATGGATTCCCAGATACAGCAAGATTGGCAGGCC 

CCTCTATACTGTAATCAAGGAGACTCACGAGGGCAAGTACTCATCTAGTAG 

AATGGGAACTAGGGACAGAAACAGCCTTCAAAACCTTAAAGCAGGCCCTA 

GTACAATCTCCAGCTTTAAGCCITCCCACAGGACAAAACTTCTCTTTATAC 

atcacagagagggcagagatagctcttggtgtccttattcagactcatggg 

ACTACCCCACAACCAGTGGCACACCTAAGTAAGGAAATTGATGTAGTAGC 
AAAAGGCrGGCCTCACTGTTTATGGGTAGCTGTGGTGGTGGCTGTCTTAGT 

gtcagaagctatcaaaataatacaaggaaaggatctcactgtctggacta 

ctcatgatgtaatggcatactaggtgccaaaagaagtttatgggtatcaga 

caaccacctgcitagataccagggactactcctggaggattgggcitcaag 

tgcgttttttgtggcctcaaccctgccacttttcctccagaggatggagag 

ccgcttgagcatgcttgccaacaggttgtaggccagaattattccacccga 

gatgatctcttagagtacccttagctaatcctgaccttaacctatatacca 

atggaagttcatttgtggaaaacgggatatgaagggcaggttatgtcatag 

ttagtgatgtaatcatacttgcaagtaagcctcttaccccaggggccagca 

ctcagttagcagaactagtcacacttaccttaaccttagaactggga^aagg 

gaaaaagaataaatatgtatacagatagtaagtatgcttatctaat^^ 

atgcccatgctgcaatatggaaggaaagggagttcctaacccctggggga 

acccccattaaataccacaaggyaaatcatggagttattgcacgcagtgc 

aaaaactcaaggaggtggcagtcttacactgccgaagcyatcaaaaaggg 

gaaggagaggggagaacagcagcataagtggttggcagaggcagtgaaa 

gaccagcagagagaaggagagagacaacgtcaacgacagaaggaaagaa 

gaggaggagacagagaggaagagacagagagacagttagtcgaagagag 

agacagagagaggaagagacagacagaaagtccaagagagaaggaaaga 

gaggaagagaccaaggagtccnagagagagaaagagatagaagtagtaa 

agaaaaa^cattgtaccctattcctttaaaagccggggtatatitaaaacc 

TA^TOATAATTGAGTIXriTGCACCCTCCTCCAGGGGATYGCTGGGAGG 
AAACCCTCAACCGATATGTGAAAATTGTGGGTCGTCCCTATGTCTCAATTA 
CCAGCCAATACCCC C1TG1 1 1 1 1 AGTGTG AACG AGGGTGTAGAGCGCAG AC 

agggagacctctgacaatccatacccxtcctatccaaaatccttaacccag 

CAGG^mCTAAAAGGGGATCTAAATCTTAATTAATTACCATACAAAGGTC 

aaaccagatctaggaggaacttccttcaggacaggatgatagatggttcct 

CCCAGGCGATTAAAGAAAATAAAAAGACACATGGGCAGCCAGTAAGTGAT 

aagggaacactagtagaagcagttaggagaacsttgc^ataat^gctct 

ACTCCAAATGTGTGAGTTGTTCGCACTCAGCCCAAATCrTAAAGTACr^C 

AGAATTAGGGAGGAGCCATITACACCAATTCTAAGTTAATATGGACrrGGAT 

GAGGTTTTATTAATAGCGAAGGAGAATTAAATCCTAAACTNACAAGGTTTT 

CAACTAAAGTAAATTTTACTAAAAGCTAACAGTGTAACATGCATTATCCTA 

CTA^CAACACACTCTCANAGGATTCCTCAGACAGTTTACAAGAAATAACAA 

A^TCTATC^GGTAAGGATAGTAACTACAATCCCAAATACATTCTTTGGCAG 

CAGTGACTCTC 
SEQ ID NO 53 (JLBc2)
TCAGGGATAGCCCCC.\TCTATTTGATCAGGCACTAGCCCAAGATCTAGGCC 

ACTTCTG a agtcc aggc attct agtccttc agt atgtgg atg atttactttt 

ggctaccagtttggaagcctcatgccagcaggctacttgagatctcttgaa 

cittctagctaatcaagggtgtatggcatctaaattgaaagtccagctctg 

cctacaacaagtcaaatatctaggcctaatcttagatagaagaaccaggg 

ccctcagcaaggaatgaataaagcctatgctggcttatcggcaccctaaga 

cattaaaacaattgtgggggttccitggaatcactggcttttgccgactat 

ggatccctggatagagtgagatagccaggccccctctattactcttatcaa 

ggagacccagagggcaaatacttatctagtattatgggnaccagaggcag 

aaaaagccttccaaaccttaaaggagaccctagtacaagctccagctttaa 

gccttcccacaggacaaancttctctttatatgtcacagagagagcaggaa 

TAGCrCCTGGAGTCCTTACTCAGACTTrTGGACGACCCCACGGCCAGTGGC 

rtacctaagtaaggaaattgatgtagtagcaaaaggctggcctcactgttt 

ATGGGTAGTTGCGGCTGTGGCAGTCTTACTGTCAAAGGCTATCAAAATAAT 

acaaggaaaggatttcactatctggactactcatgaggaaaatggcatatt 
aggtgccaaaggaagtttttggctatcagacaaccacctgctcagattcca 

GGCACTACTGATrGAGAGACCAGTGCTTTAAATATGTATGTGTGTGTGTGG 

ccctcaaccctgccactgttctcccagaagatggagaaccaatgaagcatt 

actgtcaacaaattagagtccagagttatgctgcctgagaggatctcttag 

aagtccccitagctaatcctgaccttaacctatatgctgatggaagttcac 

ttgtggagaatgggatacgaaaagcacattatgccatagttagtgaggta 

acagtacttgaaagtaagcctattcccccatggaccagagcccagttagca 

gaactagtggcacttacccaagccttagaactaggaaagggaaaaataat 

aaatgtgtatacagatagcaagtatgcttatctaatcctacatgcccatgc 

tgcagtatggaaagaaagggagttcctaacctctgggggaacccccatta 

aataccacaaggcaaatcatggagttattgcatgtagtgcaaaacctcaa 

gtaggtggcagttttacactgcctgaagctatggggaaggagagaggaga 

acagcagcataagtggctagcagaggcagcgaaagactagcagagagga 

gaggtaggggaaagacagaaagtcaaagaaaagaagtcaaagacagaca 

gagaaagagacagagggagccagagagaaagaaaagagagaacgaaaga 

gacagaatgtcaaagaacagaagagagaggcagcgccagaagagttaag 

aaagtgagaaagagagatggaaatagtaaagaa aaaac agtgtaccctat 

tcctttaaaagccagggtaaatttaaaacgtataattttataattggaagg 

tcttctccataaccctataacattaaaataccaccttgttgtcagtgtaaac 

aagagcatagcccaaaagcactgaggccactgacaacccatagccttcct 

atcaaaaatccttaactctgcaggtttcctaacaggggatctaaatctcaa 

ctaatcaccatacaatggtccgaccagacctaggagcgactcccctcagg 

acagaaggatggatggttcctcccaggccattaagggaaagagacacaat 

gggtattcagtaagtgataagggaactcttgtagaagcagttaggaagatt 

gcctaatatttggtctgctcaaatgtgccagctgtttgcactcagctaaac 

cttaaattacttacagaattaggaaggagccatctataccaattctgagtt 

AATATGAGCTGAACAAGTTCTTATrAATAGCAAAGAATCATrGAAATCTCA 

AACTTGCAAACnTTTCAACAAAAGTAAAGTTTGCTGAAAGTTAGCAGTGTA 

ACATGTATTATCCTAACTTCTAATCTTGTGGAAATCAGACCCTATCAGTGC 

CCCTCAAAGCTGAAGTCCATCAGCATATGGCCATACAACTAATACCCCTAT 

TrATAGGGTTAGGAATGGCCACTGCTACAGGAATGGGAGTAACAGGTTTAT 

CTACITCATTATCCTATTACCACACACTCTTAAAGGATTTCrCAGACAGTTT 

ACAAGAAATAACAAAATCrrATCCTTACTCTNTARTCCCAAATAGRTTCTTT 

GGCAGCAGTGACTCTC 
1 TTCCTGAGTT CTTGCACTAA CCTCAAATGA GAGAAGTGCC GCCATAACTG CAACCCAAGA

61 GTTTGGCGAT CCCTGGTATC TCAGTCAGGT CAATGACAGG ATGACAACAG AGGAAAGATA 

121 ATGATTCCCC ACAGGCCAGC AGGCAGTTCC CAGTGTAGAC CCTCATTAGG ACACAGAATC

181 AGAACATGGA GATTGGTGCC GCAGACATTT GCTAACTTGC GTGCTAGAAG GACTAAGGAA 

241 AACTAGGAAG ATATGAATTA TTCAATGATG TCCACTATAA CACAGGGGAA AGGAAGAAAA 

301 TCCTACTGCC TTTCTGGAGA GACTAAGGGA GGCATTGAGG AAGCATACCA GGCAAGTGGA

361 CATTGGAGGC TCTGGAAAAG GGAAAAGTTG GGAAAAGTAT ATGTCTAATA GGGCTTGCTT
421 CCAGTGTGGT CTACAAGGAC ACTTTAAAAA AGATTGTCCA ATAGAAATAA GCCACCACCT 

481 CGTCCATGCC CCTTATGTCA AGGGAATCAC TGGAAGGCCC ACTGCCCCAG GGGATGAAGG
541 TCCTCTGAGT CAGAAGCCAC TAACCAGATG ATCCAGCAGC AGGACTGAGG GTGCCCGGGG 
601 CAAGCGCCAG CCCATGCCAT CACCCTCACA GAGCCCCAGG TATGCTTGAC CATTGAGGGT 
661 CAGAAGGGTA CTGTCTCCTG GACACTGGCG GGCCTTCTCA GTCTTACTTT CCTGTCCTGG 
721 ACAACTGTCC TCCAGATCTG TCACTGTCCG AGGGGTCCTA GGACAGCCAG TCACTAGATA
781 CTTCTCCCAG CCACTAAGTT GTGACTGGGG AACTTTACTC TTCCACATGC TTTTCTAATT 
841 ATGCCTGAAA GCCCCACTCT CTTGTTAGGG GAGAGACATT CTAGCAAAAG CAGGGGCCAT 
901 TATACATGTG AATATAGGAG AAGGAACAAC TGTTTGTTGT CCCCTGCTTG AGGAAGGAAT 
961 TAATCCTGAA GTCCGGGCAA CAGAAGGACA ATATGGACAA GCAAAGAATG CCCGTCCTGT 

1021 TCAAGTTAAA CTAAAGGATT CCACCTCCTT TCCCTACCAA AGGCAGTACC CCCTCAGACC 

1081 CGAGACCCAA CAAGAACTCC AAAAGATTGT AAAGGACCTA AAAGCCCAAG GCCTAGTAAA 

1141 ACCAAGCAAT AGCCCTTGCA AGACTCCAAT TTTAGGAGTA AGGAAACCCA ACGGAC 



SEQ ID NO 56 (GM3) 
FIG. 28



GATGCCTTTTTCTGCATCCCTGTACGTCCTGACTCTCAATTCTTGTTTGCCTTTGAAG 

ATCCTTTGAACCCAACGTCTCAACTCACCTGGACTGTTTTACCCCAAGGGTTCAGGGA 

TAGCCCCATCTATTTGGCCAGGCATTAGCCCAAGATGCCTTTTGCATCCCTGTACGTG 

ACTCTCAATTCTTGTTTGCCTTTGCCTTTGAAGATGCTTTGAACCCAACGTCTCAACT 

CACCTGGACTGTTTTACGCCAAGGGTTCAGGGATAGCCCCCATCTATTTGGC 

CAGGCATTAGCCCAA SEQ ID NO 40 



Asp-Ala-Phe-Phe-Cys-Ile-Pro-Val-Arg-Pro-Asp-Ser-Gln-Phe- 
Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn-Pro-Thr-Ser-Gln-Leu- 
Thr-Trp-Thr-Val-Leu-Pro-Gln-Gly-Phe-Arg-Asp-Ser-Pro-His- 

Leu-Phe-Gly-Gln-Ala-Leu-Ala-Gln 

SEQ ID NO 39 (POL2B)
[Figure: overlapping octapeptides, each offset by one residue, derived from the POL2B amino acid sequence (SEQ ID NO 39)]
Cys-Ile-Pro-Val-Arg-Pro-Asp-Ser-Gln-Phe-Leu SEQ ID NO 41



Val-Leu-Pro-Gln-Gly-Phe-Arg-Asp-Ser-Pro-His-Leu-Phe-Gly-Gln-Ala-Leu-Ala SEQ ID NO 42



Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu SEQ ID NO 43
Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn SEQ ID NO 44
10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890 

CTIOOQCAAC TAATAAGGAC QOOQCTTTCA AOGCAAACAG TOCAAAAGGA 50 
LPQL IRT PLS TQTV QKD 
FPN . .GP PF Q PKQ SKRT 
SPT NKD PPFN PNS PKG 



CATAGACAAA GGACTAAACA ATGAADCAAA GAGIGQCAAT ATItXXJIQCT 100 
IDK GVNN EPK SAN IPWL 
T K E.T MNQR VPI FPG 
H R Q R SKQ .TK ECQY SLV 

TATGCADQCT CQ^AGOGCTG QGAGAAGAAT TOQQQQCAGC CAGAGTOCAT 150 

CTL Q A V GEEF GPA RVH 
YAPS KRW EKN SAQP ECM 
MHP PSGG RRI RPS QSAC 

GTADCTTTTT CICICICACA CITGAAQCAA ATEAAAATAG ACNEAGGINA 200 
VPFS LSH LKQ IKID XGX 
YLF LSHT .SK L K T.VN 
TFF SLT LEAN N R XRX 



ATINICAGAT AGCXX7IGATC GYTATATTGA TGTTTEACAA QGATTAGGAC 250 
XSD SPDG YID VLQ GLGQ 
XQI A L M XILM FYK D.D 
I X R . P ♦ W L Y CFTR IRT 



AATCCTTIGA TCIGACATGG AGAGATATAA TATTACIGCT AAATCAGAOG 300 

S F D L T W R D I I L L L N Q T 
NPLI .HG EI. YYC. IRR 
IL. SDME RYN ITA KSDA 

CIAAOCICAA ASGAGAGAAG TGCIGGCAIA ACIGGAGOOC GAG&GITIGG 350 
LTSN ERS A A I TGAR EFG 
. P Q MREV LP. LEP ESLA 
NLK E K CCHN WSP RVW 

CAAIUICIGG TAICICAGIC AGCTCAATOA TAGGATGACA AOGGAGGAAA 400 
NLW YLSQ VND RMT TEER 
ISG ISV RSMI G.Q R R K 
QSLV SQS GQ. D D N G G* : K 

GAGAAGGATT GO0CACAGQG CAGCftGQCAG TTOOCAGIGT AGCTOCTCAT 450 

ERF PTG QQAV PSV APH 
ENDS PQG SRQ FPV. LLI 
RTI PHRA AGS SQC SSSL 

TQGGACACAG AATCAGAACA T3GAGATIQG TGOOQCAGAC ATTTA 495 
WDTE SEH GDW CRRH L 
GTQ NQNM EIG A A D I 
GHRIRTWRLVPQTF 
1234567890 1234567890 1234567890 1234567890

CTTCODCAAC TAATAAQ3AC COCCCITItZA AOCCAAACAG TCCAAAAGGA 50 
LPQL IRT PLS TQTV QKD 

CATAGACAAA G3£GTAAACA ATCAACCAAA GAOT30CAAT jOTOOCriOgr 100 
IDK GVNN EPK SAN IPWL 

TATGCAOXT CCAAGOQGIG G3AGAAGAAT TOQGGCX^OC CAGAGIQCAT 150 
CTL QAV GEEF GPA R V H 

GEACUiTi-iT CICICiCACA CTPSAAGCAA ATTAAAATAG ACCTAOJCAA 200 
VPFS LSH LKQ I K I D LGK 

M-iUlvjAGAT AGCITCICATC GYTAIATIGA lUl'l ' l ' lA CAA QGA2TMGAC 250 
FSD SPDG YID VLQ GL GQ 

AATOCTTIGA TCTCACATOG AG&GAIATAA TATTACTOCT AAAICftGACG 300 
SFD LTW RDII LLL NQT 

CTAACCTCAA ATGAGAGASG IXXTOOCAEA ACTOGftOXC G&GAGITIQG 350 
LTSN ERS A A I TGAR EFG 

CAATCICK33 TATCICA3TC MSICAATCA TMGATCACA ACOGMGAAA 400 
NLW YLSQ V N D RMT TEER 

GAGAAOGATT (XXX2CAGQ3 CA3CAQ3^A3 TICCCA^OTT A3CT0CTCAT 450 
ERF PTG QQAV PSV APH 

TO3SACACAG AATC^GAACA H33AGA1TOG TOCX23CAGAC ATTTACAACT 500 
WDTE SEH GDW CRRH LQL 

TOCCTOCEAN AAQGACIttP£ GAAAACTAQ3 AAGACTAN3A ATTATICAAN 550 
ACX KDXG KLG RLX IIQX 

GATOTCCACT ANNACACAG3 GGAAAGGA^ AAAATOCTAC TOOCTITCTG 600 
CPL XHR GKEE NPT AFL 

GAGAGACTAA G3GAG3CATT GA3GAA3CAT AOC&3QCAAG TOGACATIQG 650 
ERLR EAL RKH TRQV DIG 

AG30ICI03A AAAG32AAAA GTTGGQCAAA TEAIATOOCT AATAG3GCTT 700 
GSG KGKS WAN YMP NR. AC 

ULTIUaOIG CACTCTACAA QGA03CTTIA GAAAAGATIG TOCAAOTA3A 750 
FQC SLQ GRFR KDC PS R 

AAaWCOOQC OCCIOOTOCA TGOOCLTiAT GICAAGGGAA TCACIQ3AAG 800 
NKPP I> V H APY VKGI TGR 

G3CTACIQ0C 02AQ33GA03 AAGGflCCTCT OCACIAAOCT 850 

PTA PGDE GPL SQK PLT. 
FIG 38

10 20 30 40 50
A£GGAAACTC AGAAAGCCAA TADCCATITA GIAA3AIGGA CAOCAGAAGC 50
Q KETQ KAN THL VRWT PEA 

RKL RKPI PI. .DG HQKQ 
GNS ESQ YPFS KMD TRS 

A3AAQCAGCT TIOCAGGQOC TAMGAAATC CCEAAOCCAA GCmZAGIGT 100 
E A A FQAL KKS LTQ APVL 
KQL SRP R N P . P K PQC 

RSSF PGP K E I P NPS PSV 

TAMCTIGOC AAOQQQQCAA GftLTITlt-'lT TATAIGTCAC AGAAAAACAG 150 

SLP TGQ DFSL YVT EKQ 
.ACQ RGK TFL YMSQ KNR 
KLA NGAR LFF ICH RKTG 

GAATA3OTCT AGGAGTOCTT ACACMGTCC AAGGGACAAG CTIX3CAACCT 200 
E.L. ESL HRS KGQA CNL 
NSS RSPY TGP RDK L.ATC 
I A L GVL TQVQ GTS LQP 

GIQ3CATACC TGAGOAAQGA AA^IGAIGIA NIGQCAAAGG GITQGOCTCA 250 
WHT V R K LMX WQR VGLI 

GIP E.G N.CX GKG LAS 
VAYL SKE TDV XAKG WPH 

Tl Ul TliA CfiG GTAGGQGAGC TTACTTTCIG AAACAGITAA 300 

VYR .GS SSSL SF. N S . 
LFTG R A A VAV LVSE TVK 
CLQ VGQQ . Q S F L KQLK 

AATAATAC^G GGAAGAGATC TTACIGIGIG GACATCTCAT GATGIGAAGG 350 
NNTG KRS YCV DIS. CER 
I I Q GRDL TVW TSH DVNG 
. YR EEI LLCG HLM M.T 

GCmACTICAC TOCTAAAGAG GACITGIGGC TGTCAGACAA OCATITACTr 400 
HTH C.RG L V A V RQ P F T 
ILT AKE DLWL SDN HLL 
AYSL LKR TCG CQTT I Y L. 

AAATA3CAGG TTXHATIACT TCAAGIGOCA GIQCIGOGAC TOCACAITIG 450 

I A G SIT .SAS A A T AHL 
K.QV LLL EVP VLRL HIC 
NSR FYYL KCQ CCD CTFV 

TOCAACIUTT AAOOCAGOCA CKITIUI'IOC AGACAATGAA GAAAAGATAG 500 
CNS. PSH ISS RQ.R KDR 
ATL NPAT FLP DNE EKIE 
QLL TQP HFFQ TMK KR. 
AACATAACIG TCAAGAAGTA ATTOCTCAAA CCTAIIGCIGC TCGAGQQGAC 550
l-v T . L STSN CSN LCC SRGP 
U HNC QQV IAQT YAA RGD 
N I T V N K . LLKPM L LEG T 

1 

CTIUEAGAGG TIOXTIGAC TCATOOCGAC CTCAACTIGT AIACTGATOG 600 

SRG SLD . S R P QLV Y.W 
LLEV PLT DPD LNLY TDG 
F.R F P . L IPT STC ILME 



AAGITOCTIG QCA3AAAAAG GACITIGAAA AGQQG3GTAT GCAGIGATCA 650 
KFLG RKR TLK SGVC SDQ 
SSL AEKG L.K AGY AVIS 
VPW QKK DFEK RGM Q.S 

GIGATAATOG AATACTTGAA AGTAATCGQC TCAC7ICCAGG AACEAGIGCT 700 
• . W NT.K . S P HSR N.CS 
DNG ILE SNRL TPG TSA 
VIME YLK VIA SLQE L VL 

CAQCTQGCAG AAOTAATA3C QCICACTIQG GGACTAGAAT TSGGAGAA3G 750 

PGR TNS PHLG TRI RRR 
HLAE L I A LTW ALEL GEG 
TWQ N..P SLG H.N EKE 

AAAAAGQGIA AATATATATT CAGACICIAA CTATOCTTAC CTAGTCUIOC 800 
K K G K , Y I F ...R....L.. . V C L P JS ... P . P 
KRV NIYS DSK YAY LVLH 
KG. IYI QT LS MLT . S S 

A3X30C3CATOC AGCAATATOG AGAGAGAGQG AAITOCIAAC TICIGAGGGA 850 
CPC SNME REG IPN F.GN 
AHA A I W RERE FLT SEG 
MPMQ QYG ERG NS.L LRE 

ACACCTATCA AGCATCAQQG AA30CATEAG GA3ATTATIA TIGGC7IGTAC 900 

TYQ PSG KPLG DYY WLY 
TPIN HQG SH. E I I I GOT 
HLS TIRE AIR RLL LA VQ 

AGAAACCTAA AG3GGIQGCA GIUITACACT GOCAQGGICA TCAGGAAGAA 950 
RNLK RWQ SYT ARVI RKK. 
ET. RGGS LTL PGS SGRR 
KPK EVA VLHC QGH QEE 



GAGGAAAQ3G AAATAGAftGG CAATO30CAA Q03GATATIG AMCAAAAAA 1000 
RKG K.KA I A K RIL KQKK 
GKG NRR QSPS GY. SKK 
EERE IEG NRQ ADIE AKK 
AGOCQCAAGG CAGGACTCTC CATEAGAAAT GCTTATAGAA QSAQCOCTAG 1050

PQG RTL H.KC L.K DP. 
SRKA GLS IRN AYRR TPS 
AAR QDSP LEM LIE GPLV 

TATGGGGTAA TCOCCICIGG GAAAQCAAQC CCX^GTACIC AGCAGGAAAA 1100 
YGVI PSG KPS PSTQ QEK 
MG. SPL.G NQA PVL SRKN 
WGN PLW ETKP QYS AGK 

ATAGAATAGG AAACCTCACA AGG&CATACT TTCCICOCTT OGAGATGGCT 1150 
.NR KPHK DIL SSP PDG. 
RIG N L- T R . T Y^-.-F... E P. L Q M A 
IE.E TSQ GHT FLPS RWL 

AGCCACIGAG GAAQGAA 1167 

P L R K E 
S H . G R 
ATE EG 
FIG 39

AACTIGCGTG CTAGAAGGAC TAAG3AAAAC TAGGAAGACT ATGAAITATT 50
NLRA RRT KEN E D Y ELF
TCV LEGL RKT RKT MNYS
LAC K D .GKLGRL .II

CAATCAICTC CACTATAACA CAQQQGAAAG GAAGAAAATC CTACTGCCTT 100
NDV HYNT GER KKI LLPF 
MMS TIT QGKG RKS YCL 
Q.CP L.H RGK EENP TAF 

TCTGGAGAGA CTAAGGGAGG CATTGAGGAA GCATACCAGG CAAGTGGACA 150 

WRD .GR H.GS IPG KWT 
SGET KGG IEE AYQA SGH 
LER LREA LRK HTR QVDI 



TTGGAQGCIC TQGAAAAGGG AAAAGTTGQG CAAATTGAAT GCCTAATAGG 200 
LEAL EKG KVG QIEC LIG 
WRL WKRE KLG KLN A. .G 
GGS GKG KSWA N.M PNR 

G LT1 U L .T10C AGTGCAGTCT ACAAG3AC0C TITAGAAAAG ATIGICCAAG 250 
LAS SAVY KDA LEK IVQV 
LLP VQS TRTL .KR LSK 
ACFQ CSL QGR FRKD CPS 

TAGAAATAAG CCGCCCCTCG TCCATGOOCC TTATGTCAAG GGAATCACTG 300 
EIS RPS SMPL MSR ESL 
K . A APR PCP LCQG NHW 
RNK PPLV HAP YVK GITG 

GAAGGOCTAC TGCCCCAGGG GACGAAGGTC CTCTGAGTCA GAAGCCACTA 350 
EGLL PQG TKV L.VR SH. 
KAY CPRG RRS SES EATN 
RPT APG DEGP LSQ KPL 

ACCIGATGAT CCAQCAGCAG GACIGAGQGT QOQCGQGGCA AGIGCCftGOC 400 
PDD PAAG LRV PGA SASP 
LMI QQQ D.GC PGQ VPA 
T. .S SSR TEG ARGK CQP 

CATGCCATCA CCCTCAGAQC CCCOQGIATG TTTGACCATT GAGAGCCAGG 450 

CHH PQS PGYV .PL RAR 
HAIT LRA PGM FDH. EPG 
MPS PSEP RVC LTI ESQE 

AAGITAACTG TL'ICCIGGAC ACT33CX3CAG CCTICTCAGT CTTACTTTCC 500 
KLTV SWT LAQ PSQS YFP 
S.L SPGH WRS LLS LTFL 
VNC LLD TGAA FSV LLS 
TGTCCCAGAC AATTGTCCTC CAGATCIGTC ACIAICCGAG QQGJICXlTAfiG

' ,VJ VPD NCPP DLS L S E GS.D 

D SQT IVL QICH YPR GPK 

CPRQ L S S RSV TIRG VLR 

ACAGCCAGTC ACTACATACT TCTCICAQCC ACTAAGTIGT GACTQQQSAA 600 

SQS LHT SLSH . V V TGE 
TASH YIL LSA TKL. L G N 
QPV TTYF SQP L S C DWGT 

CTITACTCTT TICACATQCT TTTPCTAATTA TGCCTGAAAG C03ZACTCCC 650 
LYSF HML F.L CLKA PLP 
FTL FTCF SNY A.K P H S L 
LLF SHA FLIM PES PTP 



TlUl ' lfl GGGA GAGACATTTT AQCAAAAQCA GQ3GOCATEA. TACAOCTCAA 700 
C.G ETF. QKQ GPL YT.T 
VRE RHF SKSR GHY TPE 
LLGR DIL A K A GAII HLN 

CATAQGAAAA GGAATACCCA TTIGCTGTCC GCTOCTTGAG GAAGGAATTA 750 

.EK EYP FAVP CLR K E L 
HRKR NTH LLS PA.G RN. 
IGK GIPI CCP LLE EGIN 

ATOCTGAAGT CTQQGCAATA GAAGGACAAT ATGGACAAGC AAAGAATGCC 800 
ILKS GQ. KDN MDKQ RMP 
S.S LGNR RTI WTS KECP 
PEV WAI EGQY GQA KNA 

CGTCCTGTTC AAGTTAAACT AAAGGATICr QOCTaCTTIC OCTACCAAAG 850 
V L F KLN. RIL PPF PTKG 
SCS S.T KGFC LLS LPK 
RPVQ VKL KDS ASFP YQR 

GAAGTACOCT CTEAGACOOG AGGCCCTACA AGGACTCAAA AGATIUITAA 900 

STL LDP RPYK DSK DC. 
EVPS .TR GPT RTQK IVK 
KYP LRPE ALQ GLK RLLR 

QGACCTAAAA GCCCAAQQCC TAGTAAAACC ATGCAGTAGC COCTQCAATA 950 
GPKS PRP SKT MQ.P LQY 
DLK AQGL VKP CSS PCNT 
T.K PKA . N H AVA PAI 

CTCCAATTTT AQGAGTEAAGG AAACCCAACG GACAGTGGAG GTTAGTGCAA 1000 
SNF RSKE TQR TVE VSAR 
PIL GVR KPNG QWR LVQ 
LQF.E.GNPTDSGG.CK 
C GATCICAGGA TTATEAATCA QQCTGITITT O^ICIAIAOC C^GCIGIATC 1050

SQD Y. . GCFS SIP SCI 
D L R I INE A V F PLYP AVS 
ISG L L M R LFF LYT Q L Y L 

TAGOOCTTAT ACICIGCTTT COTIAATACC AGAQGAAGCA GAGTEAGITEA 1100 
. PLY SAF PNT RGSR VVY 
SPY TLLS LIP EEA E.FT 
ALI1/CF P.YQ RKQ SSL 

CAGIXXTOGA OTITAAGGAT GOCTLTl'lCT GCATOCX^IGT ACATOC7IGAT 1150 
SPG P.GC LFL HPC TS.F 
VLD L. K D ASFC IPV HPD 
QSWT LRM PLS ASLY ILI 

TC7ICAATICT TUITIUIL'IT TGAAGATOCT TIGAACQCAA TCICICAATT 1200 

SIL VCL R S F EPN VSI 

SQFL FVF EDP LNPM SQF 
LNS CLSL K I L . T Q CLNS 

CADCTOGACT G?lTrilACCXX: AGGGGTTCXDG GGATAGQCDC CAICIA3TIG 1250 
HLDC FTP GVP G.PP S I W 
TWT VLPQ GFR DSP HLFG 
PGL FYP RGSG IAP IYL 

GCCAQGCAIT AGCDCAAGAC TTGAGQCAAT TCICATADCT QGACATCITG 1300 
PGI SPRL EPI LIP GHLV 
QAL AQD LSQF SYL DIL 
ARH. PKT .AN S H T W TSC 

laJLTOOGEA TQQGATCATT TAATITIEGC CAOXCTICA GAAACCTIGT 1350 

LRY G M I . F . P PVQ KPC 
SFGM G . F NFS HPFR NLV 
PSV WDDL I L A TRS ETLC 

GQCATCAAGC CAOOCAAQQG TIUITAAAIT TCCTCACTOC: GTCIQQCTAC 1400 
A I K P PKR S.I SSLR VAT 
PSS HPSV LKF PHS VWLQ 
HQA TQA FLNF LTP CGY 

AAQGITIXXA AADCAAAGGC TCAGCTCIGC TCACA3CAGG TEAAATACTT 1450 
RFP NQRL SSA HSR LNT. 
GFQ TKG SALL TAG . I L 
KVSK PKA QLC SQQV KYL 

A3QGTIAAAA TIATCCAAAG GCADCAQQGC OCTODGTIGAG GAAT3EATOC 1500 

G.N YPK APGP SVR NVS 
RVKI I Q R HQG PL.G MYP 
GLK LSKG TRA LCE ECIQ 
AAOCTGIACT GQCTEATCIT CATOOCAAAA CDCTAAAGCA ACTAAGAAGG 1550
NLYW L I F IPK P.SN .EG 
TCT GLSS SQN PKA TKKV 
PVL AYL HPKT LKQ LRR 

TOC7TIG3CAT AACAGGTTIC T3Q03AA 1577 
PWH NRFL P 
LGI TGF CR 
SLA. QVS AE 
TD2AGGAGCA GGACIGAQQG TOCXXX3G3GC A^GIGOCAGC O^TOQCATC 50
SSSR TEG ARG KCQP MPS 

AXUICAGAG Cn^QQGIAT CTTIGAOCAT TGAGAGOCA3 GAAGTEAAOT 100 
PSE PRVC LTI ESQ EVNC 

GIUIXXTOGA. CACIGGCX3CA GCCTIUICAG TCITACTTIC CTGTCOCAGA 150 
LLD TGA AFSV LLS CPR 

CAATIGIOCT CCAGATCIGT CACTAPOCGA QQQGICCTAA GAC^GQCACT 200 
QLSS RSV TIR GVLR QPV 

CACTACATAC TICICICAGC CACTAAGTIG TCACIQGGGA /O^ACICT 250 
TTY FSQP LSC DWG TLLF 

TTICACAIGC TITTCTAAIT ATOOCTGAAA GOCXXACTOC CTIGITAGGG 300 
SHA F L I MPES PTP LLG 

AGAGACATTT TA3CAAAAGC AGQQGQCATT ATACAXTGA. ACATAGGAAA 350 
RDIL AKA G A I IHL N I G K 

A3GAATAQ0C ATITGCTCTC CXXTIGCTIGA. GGAA3GAATT AATCCIGAA3 400 
GIP ICCP LLE EGI NPEV 

TCIGGQCAAT A3AAGGACAA TATOGACAAG CAAAGAATOC OCCTCCIGTT 450 
WAI EGQ YGQA KNA RPV 

CAA3TIAAAC TAAAQGATTC TOCC7IGCTIT OOCTADCAAA GGAA3EA00C 500 
QVKL KDS ASF PYQR KYP 



TCITA3AQ0C GAGGCOCTAC AAGGACICAA AAGAL'lUl'lA A3GAOCT 
LRP EALQ GLK RLL RT 
FIG 45

[Figure labels, in order of appearance: 0.044; 0.014; (MSRV); LENTIVIRINAE; Type D; Type B; Type A; ONCOVIRINAE; Type C; SPUMAVIRINAE (Foamy viruses)]
FIG 46



TOCKXJVGCA aSACTOSGOG UXOOXJCDC APGTCCCPCO CCKIGCCKTC 
G ARG KCQP MPS 

ulu jicagac araxmrAT ctttgaccat tgagagocpg gaagttaact 

PSS PRVC L T I ESQ EVNC 



GTCraCTGGA CACTCQCGCA GCTTTCTCAG TCIT ALT I'LL' CTT7ICOCAGA 
L L D TGA AFSV LLS CPR 

CAATTKTCCT GCAGAICTGT CACTATCCGA 0GCETCCT7G GAOGOCAGT 
QLSS RSV TIR GVLG Q P V 

OVCTACA^C TTCICTCAGC CACTAAGTTG TG7CT33GGA ACTTTACICT 
TTY FSQP LSC DWG T L L F 

TTTCACA'iaC Ti u l*lC13ATT AJGOCTGAAA GOQCCJOUZ (_TlUl"iM33 
SKA FLI MPES PTP L L G 



AGflGfiCATTC TAGCAAAAGC ACDGGOTTT ATAZACCTCA 
R D I L A K A G A I IHLN 



AOttAGGAAA 
I G K 



AGGAA^Crr ATTITXTIGTC ULL"1ULTJL>A QGAAGGAATT AASTTTGAAG 
GI? ICCP LLE EGI NPEV 



tctgoocaat /gaagg7Caa tpcpggacjag caaagaatci: cancdGrr 

WAI EGQ YGQA K N A RPV 



CAAGTIAAAC TAAAGGXrTC TOCCTCCTrT CnTAODKAA GGAH3TM0C 
QVKL KDS ASF PYQR KYP 



TCnSGACDC GAGOCX3CTAC AflGGANCICA AAJGATIGTr AAGGAOCTUkA 
LR? EALQ GXQ KIV KDLK 



AAGOLXAA33 GCT7*7D*AA CXATGCSGTA GOGCCIGC7A TACICCAATT 
A Q G LVK PCSS P C N TPI 



T AGGJGTSA □GAAAOXAA CXEAOGTGG AGGTTJCITX: AAGATCTCJC 



G V R KPN 
region A 



GQW RLVQ DLR 



GATOCTAAT GAGGCTGXTT TTUCTCTfTA OOCAGCTOIA. TCUGGOCTT 
I I N EAVF PLY PAV SSPY 

AraCTCTGCr TTaXTAATA CEAGW3GAAG CACAGTOGTCT TAC7GTOCTG 
TLLSLI PEEAEWFTVL 

GAOCTTAAOG ATGOCTTTTr ClGOOaCTT GEFCGTOCH3 ACTCTCAATT 
DLKD AFF CIP VRPD SQF 

CTTGTTTGCC TTTGAACATC CTTTGAACCC AACG1CK3A CTOCCTOGA 
LFA FEDP LNP TSQ LTWT 

CTGTnTAOC CCAAGOGTTC «33GMMCC aGCATCBOT TGOCCAGGC^ 
V L P Q G F RDSP HLF C Q A 

TraGOCAAG ACTTGAGTCA ATTCIOTAC CTQ3AOOC TTCnCCTTCA 
L A Q D LSQ FSY LDTL VLQ 

GTAOGTGGAT GATTTACTTT TAGTQGQQTG TKAGAAAdT TTdGOCMC 
YVD DLLL V A R SET LCHQ 



AAGQC7CGCA AjAACTCTDV ALTTIUL'ILA CDCCTGT3S CEACAAGGIT 
A T Q ELL TFLT TCG YKV 



TGCAAACCAA AOGCTCGGCT CTCC1CSCAG GACATTWStT ACTIAGQ3CT 
SKPK ARL CSQ EIRY LGL 



AAAATIATOC AAAGGOCCA 0330CCTCAG TGJGGAACGT ATOCAQOCTA 



A L S EER IQPI 



TACiax.' ;'iA icciaacGC aaaaogciaa aocaaceaag Acmnurrr 

LAY P HP KTLK Q L R GFL 



GGCTCSAOG GTITL*1U.*< 
G I T G F C 



COG AAAAOCW-T 



CCCAQC7EAOV GGUXKBOC 
P R Y T P I A 
CACAOCATTA TATAQCTAA TDGQGAA/C TCMAAAGOC AAIACCTATT
RPL YTLI RET OKA NTYL 

TAGTAAGATC GfCACCDCA GAflGTOaCTT TCOGG00CT AAAGAAG3CC 
VRW TPT EVAF QAL KKA 

CTAAOXAAG OXOGTOT CAUTl'lUULR ACXEGCAAG A TITITLTIT 
LTQA PVF SLP T G Q D FSL 

ATATGOCACA GAAAAAACAG GAATAGCICT AGGAGTCCTT AGOCAGGTCT 
YAT EKTG X A L G V L TQVS 

OVOUMGAG CTTOC^AOZC GT03TATAGC TGK3TAAGGA AAIT3VTGTA 
GMS LOP VVYL SKE IDV 

GTGGCAAAGG GTTGO0CTCA TRITTTAJUJ GTAHTGGOGG: OtfTUGCTGr 
VAKG WPH CLW V M A A VAV 

crTAcrrATcr gaaqc&gtta aaataat?ca GGGAAGAGAT ctd^ctgigt 

LVS EAVK IIQ GRD LTVW 

GSAOMCTCA T3VTGIGMC GOCAITdCA CTOCTAAAGG AGACTTGT3G 
TSH DVN GILT AKG DLW 

TTGTCAQVCA AGCATTI7CT IWCTTXICSG GCICTATD^ TTGAAGAGOC 
LSDN HLL NYQ ALLL EEP 



A3TGCIGAGA CC30XACTT GT3ZAACTCT TAA«XI 30C ACATTTCTTC 
VLR LRTC ATL KP \ TFLP 

CAGACAHTCA AGAAAflGTCA GAACA3MCT GICAACAAGT AATTOCTTChA 
ONE EKI EHNC QQV IAQ 

^CCIMGCTG CICGSOGGGA GTTQCCITGA CTGA3CCXX3A 

TYAA RGD LLE VPLT DPD 

^ #.RK«MH 

CC*TCAA<|rTG TTCTACIGATG GAA3TTCCTT GGC7GAAAAA QGACTTOGAA 
LNIL YTDG SSL AEK GLRK 



AAOCmETA TXTAGIGfCC A33GA3AATC GAMACTTGA AAGTJATOOC 
AGY AVI SONG ILE SNR 



CTC^CKCAG GAACEfiGTDC TGACCTT33CA GAACTAATAG OOCTCACriC 
LTPG TSA HLA E L I A LTW 



CGOVTDGAA TWX3GMG GAAAAAGQ3T AAAIATAEAT TCW3CTCEA 
ALE LGEG K R V N I Y SDSK 



A7EKPGCTTA CETAGTOCTC CWXAATATC GAGAQVGAGG 

YAY LVL HAHA AIW RER 



GftATTOCTAA CZTCZGK33G AACAOCEMC AAOCATCW3G AAGOCATITC 
EFLT SEG TPI NHQE AIR 



GACSmSCTA TEGCTOIAC AGAAAOCTAA AGA03TaX7V GTCTIACACT 
RLL LAVQ KPK EVA VLHC 



CCOCOGTOV TCAGGAAGAA GACCAAA1ZC AAAOAGAAGG CAATCE(XAA 
Q G H QEE EERE IEG NRQ 



QCQGMJtTTC AJGCAAAMA AOCCOCAAOG CMCACTCTC C3CTT3GAAAT 
AOIE AKK A A R QD.SP LEM 
AIGATOCAGC AQCAQGACNG AGQGIGOOOG GQGCAAGOQC CAGOOCAIGC 50
MIQQ QDX GCP GQAP AHA 



CATCACXXTC ACAGAGQQOC AQGTAIQCIT GACCATIGAG GGICAGAAGG 100 
ITL TEPQ VCL TIE GQKG 



GINACTGICT OCIGGACACT GQCOC3SDGCT TCICAGICTT A2TITCCIGT 150 
XCL LDT GGAF SVL LSC 



GCTGGACAAC ATCHGICACT GTOCGAGGGG TOCTAGGACA 200 

PGQL SSR SVT VRGV LGQ 



QCCAGTCACT AGATACTIUT CXDC^GQCACT AAGITGTCAC T3QQGAACTT 250 
PVT RYFS Q P L SCD WGTL 



TACHXJITCDC ACATCCTTTT CTAATTATCC CIGAAAGCCC CACICICITG 300 
LFP HAF LIMP ESP TLL 



TTGQQGAGAG ACA3TCTAGC AAAAGCAGQG GCCATTATAC ATCIGAATAT 350 